(57) Abstract

Disclosed are new ligands for use in a binding assay for proteases and phosphatases which contain cysteine in their binding sites
or as a necessary structural component for enzymatic binding. The sulfhydryl group of cysteine is the nucleophilic group in the enzyme's
proteolytic or hydrolytic mechanism. The assay can be used to determine the ability of new, unknown ligands and mixtures of
compounds to competitively bind with the enzyme versus a known binding agent for the enzyme, e.g., a known enzyme inhibitor. By the
use of a mutant form of the natural or native wild-type enzyme, in which serine, or another amino acid, e.g., alanine, replaces cysteine, the
problem of interference from extraneous oxidizing and alkylating agents in the assay procedure is overcome. The interference arises because
of oxidation or alkylation of the sulfhydryl, -SH (or -S⁻), in the cysteine, which then adversely affects the binding ability of the enzyme.
Specifically disclosed is an assay for tyrosine phosphatases and cysteine proteases, including caspases and cathepsins, e.g., Cathepsin
K (O2), utilizing scintillation proximity assay (SPA) technology. The assay has important applications in the discovery of compounds for
the treatment and study of, for example, diabetes, immunosuppression, cancer, Alzheimer's disease and osteoporosis. The novel feature
of the use of a mutant enzyme can be extended to its use in a wide variety of conventional colorimetric, photometric, spectrophotometric,
radioimmunoassay and ligand-binding competitive assays.

TITLE OF THE INVENTION

LIGANDS FOR PHOSPHATASE BINDING ASSAY 



FIELD OF THE INVENTION

This invention relates to the use of mutant phosphatase
and protease enzymes in a competitive binding assay. Specific
examples are the enzymes tyrosine phosphatase and cysteine
protease, e.g., Cathepsin K, and the assay specifically described is a
scintillation proximity assay using a radioactive inhibitor to induce
scintillation.

BACKGROUND OF THE INVENTION 

The use of the scintillation proximity assay (SPA) to
study enzyme binding and interactions is a relatively recent form of
radioimmunoassay and is well known in the art. The advantage of
SPA technology over more conventional radioimmunoassay or
ligand-binding assays is that it eliminates the need to separate
unbound ligand from bound ligand prior to ligand measurement. See,
for example, Nature, Vol. 341, pp. 167-178, entitled "Scintillation
Proximity Assay" by N. Bosworth and P. Towers; Anal. Biochem.
Vol. 217, pp. 139-147 (1994), entitled "Biotinylated and Cysteine-
Modified Peptides as Useful Reagents For Studying the Inhibition of
Cathepsin G" by A.M. Brown et al.; Anal. Biochem. Vol. 223, pp. 259-
265 (1994), entitled "Direct Measurement of the Binding of RAS to
Neurofibromin Using Scintillation Proximity Assay" by R. H.
Skinner et al.; and Anal. Biochem. Vol. 230, pp. 101-107 (1995), entitled
"Scintillation Proximity Assay to Measure Binding of Soluble
Fibronectin to Antibody-Captured α5β1 Integrin" by J. A.
Pachter et al.

The basic principle of the assay lies in the use of a solid
support containing a scintillation agent, wherein a target enzyme is
attached to the support through, e.g., a second enzyme-antienzyme
linkage. A known tritiated or I125 iodinated binding agent, i.e., a
radioligand inhibitor for the target enzyme, is utilized as a
control, which, when bound to the active site in the target enzyme, is
in close enough proximity to the scintillation agent to induce a
scintillation signal, e.g., photon emission, which can be measured by
conventional scintillation/radiographic techniques. The unbound
tritiated (hot) ligand is too far removed from the scintillation agent to
cause an interfering measurable scintillation signal and therefore
does not need to be separated, e.g., by filtration, as in conventional
ligand-binding assays.

The binding of an unknown or potential new ligand
(cold, being non-radioactive) can then be determined in a competitive
assay versus the known radioligand, by measuring the resulting
change in the scintillation signal, which will significantly decrease
when the unknown ligand also possesses good binding properties.
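
The drop in signal under competition can be illustrated with a minimal
sketch. It assumes a simple one-site equilibrium model in which the hot
and cold ligands compete for the same site; this model, the function name
and all numerical values are illustrative assumptions and are not taken
from the specification.

```python
def radioligand_occupancy(hot_conc, hot_kd, cold_conc=0.0, cold_ki=float("inf")):
    """Fraction of enzyme sites occupied by the radioligand.

    Assumes one binding site, equilibrium, and free concentrations equal to
    the added concentrations (hypothetical simplifications).
    All concentrations share the same unit, e.g. nM.
    """
    return hot_conc / (hot_conc + hot_kd * (1.0 + cold_conc / cold_ki))


# Hypothetical values: radioligand at its Kd, cold ligand at its Ki.
uncompeted = radioligand_occupancy(hot_conc=5.0, hot_kd=5.0)
competed = radioligand_occupancy(hot_conc=5.0, hot_kd=5.0, cold_conc=20.0, cold_ki=20.0)
relative_signal = competed / uncompeted
```

Under these assumptions the proximity signal, being proportional to the
bound radioligand, falls as the cold ligand concentration rises or as its
affinity improves.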

However, a problem arises when utilizing a target
enzyme containing a cysteine residue having a free thiol group,
-SH (or present as -S⁻), which is in the active site region or is closely
associated with the active site and is important for enzyme-ligand
binding. If the unknown ligand or mixture, e.g., natural product
extracts, human body fluids, cellular fluids, etc., contains reagents
which can alkylate, oxidize or chemically interfere with the cysteine
thiol group such that normal enzyme-ligand binding is disrupted,
then false readings will occur in the assay.

What is needed in the art is a method to circumvent and
avoid the problem of cysteine interference in the scintillation
proximity assay (SPA) procedure in enzyme binding studies.

SUMMARY OF THE INVENTION

We have discovered that by substituting serine for
cysteine in a target enzyme, where the cysteine plays an active role in
the wild-type enzyme-natural ligand binding process, usually as the
catalytic nucleophile in the active binding site, a mutant is formed
which can be successfully employed in a scintillation proximity assay
without any active site cysteine interference.

This discovery can be utilized for any enzyme which
contains cysteine groups important or essential for binding and/or
catalytic activity as proteases or hydrolases, and includes
phosphatases, e.g., tyrosine phosphatases, and proteases, e.g.,
cysteine proteases, including the cathepsins, e.g., Cathepsin K (O2),
and the caspases.

Further, use of the mutant enzyme is not limited to the
scintillation proximity assay; it can be used in a wide variety of
known assays including colorimetric, spectrophotometric, ligand-
binding assays, radioimmunoassays and the like.

We have furthermore discovered a new method of
amplifying the effect of a binding agent ligand, e.g., a radioactive
inhibitor, useful in the assay by replacing two or more
phosphotyrosine residues with
4-phosphono(difluoromethyl)phenylalanine (F2Pmp) moieties. The
resulting inhibitor exhibits a greater and more hydrolytically stable
binding affinity for the target enzyme and a stronger scintillation
signal.

By this invention there is provided a process for
determining the binding ability of a ligand to a cysteine-containing
wild-type enzyme comprising the steps of:

(a) contacting a complex with the ligand, the complex
    comprising a mutant form of the wild-type enzyme,
    in which cysteine, at the active site, is replaced
    with serine, in the presence of a known binding
    agent for the mutant enzyme, wherein the binding
    agent is capable of binding with the mutant
    enzyme to produce a measurable signal.

Further provided is a process for determining the
binding ability of a ligand, preferably a non-radioactive (cold) ligand,
to an active site cysteine-containing wild-type tyrosine phosphatase
comprising the steps of:

(a) contacting a complex with the ligand, the complex
    comprising a mutant form of the wild-type enzyme,
    the mutant enzyme being PTP1B, containing the
    same amino acid sequence 1-320 as the wild-type
    enzyme, except at position 215, in which cysteine is
    replaced with serine in the mutant enzyme, in the
    presence of a known radioligand binding agent for
    the mutant enzyme, wherein the binding agent is
    capable of binding with the mutant enzyme to
    produce a measurable beta radiation-induced
    scintillation signal.

Also provided is a new class of peptide binding agents selected
from the group consisting of:

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanineamide (BzN-EJJ-CONH2), where
E is glutamic acid and J is [4-phosphono(difluoromethyl)]-L-phenylalanyl;
N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide;
N-Acetyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide;
L-Glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;
L-Lysinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;
L-Serinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide;
L-Prolinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide; and
L-Isoleucinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanine amide; and their tritiated and I125 iodinated
derivatives.

Further provided is a novel tritiated peptide, tritiated
BzN-EJJ-CONH2, being N-(3,5-Ditritio)benzoyl-L-glutamyl-[4-
phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanineamide, wherein E as used herein
is glutamic acid and J, as used herein, is the (F2Pmp) moiety,
(4-phosphono(difluoromethyl)-phenylalanyl).

Furthermore there is provided a process for increasing
the binding affinity of a ligand for a tyrosine phosphatase or cysteine
protease comprising introducing into the ligand two or more
4-phosphono(difluoromethyl)-phenylalanine groups; also provided is
the resulting disubstituted ligand.

In addition there is provided a complex comprised of:

(a) a mutant form of a wild-type enzyme, in which
    cysteine, necessary for activity in the active site, is
    replaced with serine, and which is attached to:

(b) a solid support.

BRIEF DESCRIPTION OF THE DRAWINGS 

FIGURE 1 illustrates the main elements of the invention
including the scintillation agent 1, the supporting (fluomicrosphere)
bead 5, the surface binding Protein A 10, the linking anti-GST
enzyme 15, the fused enzyme construct 20, the GST enzyme 25, the
mutant enzyme 30, the tritiated peptide inhibitor 35, the beta
radiation emission 40 from the radioactive peptide inhibitor 35 and
the emitted light 45 from the induced scintillation.

FIGURE 2 (A and B) illustrates the DNA and amino acid
sequences for PTP1B tyrosine phosphatase enzyme, truncated to
amino acid positions 1-320. (Active site cysteine at position 215 is in
bold and underlined).

FIGURE 3 (A, B and C) illustrates the DNA and amino
acid sequences for Cathepsin K. The upper nucleotide sequence
represents the cathepsin K cDNA sequence which encodes the
cathepsin K preproenzyme (indicated by the corresponding three
letter amino acid codes). Numbering indicates the cDNA nucleotide
position. The underlined amino acid is the active site Cys residue
that was mutated to either Ser or Ala.

WO 98/20024    PCT/CA97/00824

FIGURE 4 (A and B) illustrates the DNA and amino acid
sequences for the caspase, apopain. The upper nucleotide sequence
represents the apopain (CPP32) cDNA sequence which encodes the
apopain proenzyme (indicated by the corresponding three letter
amino acid codes). Numbering indicates the cDNA nucleotide
position. The underlined amino acid is the active site Cys residue
that was mutated to Ser.

DETAILED DESCRIPTION OF THE INVENTION 

The theory underlying the main embodiment of the
invention can be readily seen and understood by reference to
FIGURE 1.

Scintillation agent 1 is incorporated into small (yttrium
silicate or PVT fluomicrospheres, AMERSHAM) beads 5 that
contain on their surface immunosorbent protein A 10. The protein A
coated bead 5 binds the GST fused enzyme construct 20, containing
GST enzyme 25 and PTP1B mutant enzyme 30, via anti-GST enzyme
antibody 15. When the radioactive, e.g., tritiated, peptide 35 is bound
to the mutant phosphatase enzyme 30, it is in close enough proximity
to the bead 5 for its beta emission 40 (or Auger electron emission in
the case of I125) to stimulate the scintillation agent 1 to emit light
(photon emission) 45. This light 45 is measured as counts in a beta
plate counter. When the tritiated peptide 35 is unbound, it is too
distant from the scintillation agent 1 and the energy is dissipated
before reaching the bead 5, resulting in low measured counts.
Non-radioactive ligands which compete with the tritiated peptide 35 for
the same binding site on the mutant phosphatase enzyme 30 will remove
and/or replace the tritiated peptide 35 from the mutant enzyme 30,
resulting in lower counts than the uncompeted peptide control. By
varying the concentration of the unknown ligand and measuring the
resulting lower counts, the concentration giving 50% inhibition (IC50)
of radioligand binding to the mutant enzyme 30 can be obtained. This
then is a measure of the binding ability of the ligand to the mutant
enzyme and the wild-type enzyme.
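
As a rough illustration of how an IC50 can be read from such a
concentration series, the sketch below converts counts to percent
inhibition relative to the uncompeted control and interpolates linearly on
log concentration between the two points that bracket 50%. The
interpolation method, the background correction, the function names and
the data are all assumptions made for illustration; the specification does
not prescribe a fitting procedure.

```python
import math


def percent_inhibition(counts, control, background=0.0):
    """Percent loss of specific signal relative to the uncompeted control."""
    return 100.0 * (control - counts) / (control - background)


def ic50_by_interpolation(concs, counts, control, background=0.0):
    """Estimate IC50 by log-linear interpolation between bracketing points."""
    points = sorted(zip(concs, counts))
    inhib = [(c, percent_inhibition(n, control, background)) for c, n in points]
    for (c_lo, i_lo), (c_hi, i_hi) in zip(inhib, inhib[1:]):
        if i_lo <= 50.0 <= i_hi and i_hi > i_lo:
            frac = (50.0 - i_lo) / (i_hi - i_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_ic50
    raise ValueError("50% inhibition is not bracketed by the data")


# Hypothetical competition data: cold ligand concentration (nM) and counts.
concs_nM = [1.0, 10.0, 100.0, 1000.0]
counts = [9000.0, 7000.0, 3000.0, 1200.0]
ic50 = ic50_by_interpolation(concs_nM, counts, control=10000.0, background=1000.0)
```

A full curve fit would normally be preferred when enough points are
available; interpolation is used here only to keep the sketch short.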

The term "complex" as used herein refers to the
assembly containing the mutant enzyme. In its simplest
embodiment, the complex is a solid support with the mutant enzyme
attached to the surface of the support. A linker can also be employed.
As illustrated in FIGURE 1, the complex can further comprise a bead
(fluopolymer), anti-enzyme GST/enzyme GST-mutant enzyme PTP1B
linking construct, immunosorbent protein A, and scintillation agent.
In general, the complex requires a solid support (beads,
immunoassay column of, e.g., Al2O3 or silica gel) to which the
mutant enzyme can be anchored or tethered by attachment through a
suitable linker, e.g., an immunosorbent (e.g., Protein A, Protein G,
anti-mouse, anti-rabbit, anti-sheep) and a linking assembly,
including an enzyme/anti-enzyme construct attached to the solid
support.

The term "cysteine-containing wild-type enzyme", as
used herein, includes all native or natural enzymes, e.g.,
phosphatases, cysteine proteases, which contain cysteine in the
active site as the active nucleophile, or contain cysteine closely
associated with the active site that is important in binding activity.

The term "binding agent" as used herein includes all
ligands (compounds) which are known to be able to bind with the
wild-type enzyme and usually act as enzyme inhibitors. The binding
agent carries a signal producing agent, e.g., a radionuclide, to initiate
the measurable signal. In the SPA assay the binding agent is a
radioligand.

The term "measurable signal" as used herein includes
any type of generated signal, e.g., radioactive, colorimetric,
photometric, spectrophotometric, scintillation, which is produced
upon binding of the binding agent to the mutant enzyme.

The present invention assay further overcomes problems
encountered in the past, where compounds were evaluated by their
ability to affect the reaction rate of the enzyme in the phosphatase
activity assay. However, this did not give direct evidence that
compounds were actually binding at the active site of the enzyme.
The herein described binding assay using a substrate
analog can determine directly whether the mixtures of natural
products can irreversibly modify the active site cysteine in the target
enzyme, resulting in inhibition of the enzymatic activity. To overcome
inhibition by these contaminants in the phosphatase assay, a mutated
Cys(215) to Ser(215) form of the tyrosine phosphatase PTP1B was
cloned and expressed, resulting in a catalytically inactive enzyme. In
general, replacement of cysteine by serine will lead to a catalytically
inactive or substantially reduced activity mutant enzyme.

PTP1B is the first protein tyrosine phosphatase to be
purified to near homogeneity {Tonks et al., JBC 263, 6731-6737 (1988)}
and sequenced, by Charbonneau et al., PNAS 85, 7182-7186 (1988). The
sequence of the enzyme showed substantial homology to a duplicated
domain of an abundant protein present in hematopoietic cells
variously referred to as LCA or CD45. This protein was shown to
possess tyrosine phosphatase activity {Tonks et al., Biochemistry 27,
8695-8701 (1988)}. Protein tyrosine phosphatases have been known to
be sensitive to thiol oxidizing agents, and alignment of the sequence of
PTP1B with subsequently cloned Drosophila and mammalian
tyrosine phosphatases pointed to the conservation of a cysteine
residue {M. Streuli et al., Proc. Nat'l Acad. Sci. USA, Vol. 86, pp. 8698-8702
(1989)} which, when mutated to Ser, inactivated the catalytic activity of
the enzymes. Guan et al. {J.B.C. Vol. 266, 17026-17030 (1991)}
cloned the rat homologue of PTP1B, expressed a truncated version of
the protein in bacteria, purified it and showed that the Cys at position
215 is the active site residue. Mutation of Cys215 to Ser215 resulted in
loss of catalytic activity. Human PTP1B was cloned by Chernoff et al.,
Proc. Natl. Acad. Sci. USA 87, 2735-2739 (1990).

Work leading up to the development of the substrate
analog BzN-EJJ-CONH2 for PTP1B was published by T. Burke et al.,
Biochem. Biophys. Res. Comm. 204, pp. 129-134 (1994), with the
synthesis of the hexamer peptide containing the phosphotyrosyl
mimetic F2Pmp. We have incorporated the (F2Pmp) moiety
(4-phosphono(difluoromethyl)phenylalanyl) into various peptides, which
led to the discovery of BzN-EJJ-CONH2 (where E is glutamic acid
and J as used herein is the F2Pmp moiety), an active (5 nM) inhibitor
of PTP1B. This was subsequently tritiated, giving the radioactive
substrate analog required for the binding assay.

The mutated enzyme, as the truncated version
containing amino acids 1-320 (see FIGURE 2), has been demonstrated
for the first time to bind the substrate analog BzN-EJJ-CONH2 with
high affinity. The mutated enzyme is less sensitive to oxidizing
agents than the wild-type enzyme and provides an opportunity to
identify novel inhibitors for this family of enzymes. The use of a
mutated enzyme to eliminate interfering contaminants during drug
screening is not restricted to the tyrosine phosphatases and can be
used for other enzyme binding assays as well.

Other binding assays exist in the art in which the basic
principle of this invention can be utilized, namely, using a mutant
enzyme in which a reactive cysteine important for
activity is modified to serine (or a less reactive amino acid) to
render the enzyme more stable to cysteine modifying reagents, such
as alkylating and oxidizing agents. These other ligand-binding
assays include, for example, colorimetric and spectrophotometric
assays, e.g., measurement of produced color, fluorescence or
phosphorescence (e.g., ELISA, solid absorbent assays), and other
radioimmunoassays in which short or long wave light radiation is
produced, including ultraviolet and gamma radiation.

Further, the scintillation proximity assay can also be
practiced without the fluopolymer support beads (AMERSHAM)
illustrated in FIGURE 1. For example, Scintistrips® are
commercially available (Wallac Oy, Finland) and can also be
employed as the scintillant-containing solid support for the mutant
enzyme complex, as can other solid supports which are
conventional in the art.

The invention assay described herein is applicable to a
variety of cysteine-containing enzymes including protein
phosphatases, proteases, lipases, hydrolases, and the like.

The cysteine to serine transformation in the target
enzyme can readily be accomplished by analogous use of the
molecular cloning technique for Cys215 to Ser215 described in the
below-cited reference by M. Streuli et al. for PTP1B, which is hereby
incorporated by reference for this particular purpose.

A particularly useful class of phosphatases is the
tyrosine phosphatases since they are important in cell function.
Examples of this class are: PTP1B, LCA, LAR, DLAR, DPTP (see
Streuli et al., below). Ligands discovered by this assay using, for
example, PTP1B can be useful, for example, in the treatment of
diabetes and immunosuppression.

A useful species is PTP1B, described in Proc. Nat'l Acad.
Sci. USA, Vol. 86, pp. 8698-8702 by M. Streuli et al. and Proc. Nat'l Acad.
Sci. USA, Vol. 87, pp. 2735-2739 by J. Chernoff et al.

Another useful class of enzymes is the proteases,
including cysteine proteases (thiol proteases), cathepsins and
caspases.

The cathepsin class of cysteine proteases is important
since Cathepsin K (also termed Cathepsin O2, see Biol. Chem. Hoppe-
Seyler, Vol. 376, pp. 379-384, June 1995 by D. Bromme et al.) is
primarily expressed in human osteoclasts and therefore this
invention assay is useful in the study and treatment of osteoporosis.
See US Patent 5,501,969 (1996) to Human Genome Sciences for the
sequence, cloning and isolation of Cathepsin K (O2). See also J. Biol.
Chem. Vol. 271, No. 21, pp. 12511-12516 (1996) by F. Drake et al. and
Biol. Chem. Hoppe-Seyler, Vol. 376, pp. 379-384 (1995) by D. Bromme et
al., supra.

Examples of the cathepsins include Cathepsin B,
Cathepsin G, Cathepsin J, Cathepsin K (O2), Cathepsin L, Cathepsin
M, Cathepsin S.

The caspase family of cysteine proteases provides other
examples where the SPA technology and the use of mutated enzymes
can be used to determine the ability of unknown compounds and
mixtures of compounds to compete with a radioactive inhibitor of the
enzyme. An active site mutant of human apopain CPP32 (caspase-3)
has been prepared. The active site thiol mutated enzymes are less
sensitive to oxidizing agents and provide an opportunity to identify
novel inhibitors for this family of enzymes.

Examples of the caspase family include: caspase-1 (ICE),
caspase-2 (ICH-1), caspase-3 (CPP32, human apopain, Yama),
caspase-4 (ICErel-II, TX, ICH-2), caspase-5 (ICErel-III, TY),
caspase-6 (Mch2), caspase-7 (Mch3, ICE-LAP3, CMH-1), caspase-8 (FLICE,
MACH, Mch5), caspase-9 (ICE-LAP6, Mch6) and caspase-10 (Mch4).

Substitution of the cysteine by serine (or by any other
amino acid which lowers the reactivity toward oxidizing and alkylating
agents, e.g., alanine) does not eliminate the binding ability of the
mutant enzyme to natural ligands, although the degree of binding, i.e.,
the binding constant, may be increased or decreased. The catalytic
activity of the mutant enzyme will, however, be substantially decreased
or even completely eliminated. Thus, natural and synthetic ligands which
bind to the natural wild-type enzyme will also bind to the mutant
enzyme.

Substitution of serine for cysteine thus leads to a
mutant enzyme which has the same qualitative binding ability as
the natural enzyme but is significantly reduced in catalytic
activity. Thus, this invention assay is actually measuring the true
binding ability of the test ligand.

The test ligand described herein is a new ligand
potentially useful for drug screening purposes, and its mode of action
is generally to function as an inhibitor of the enzyme.

The binding agent usually is a known ligand used as a
control and is capable of binding to the natural wild-type enzyme and
the mutant enzyme employed in the assay; it is usually chosen as a
known peptide inhibitor for the enzyme.

The binding agent also contains a known signal-
producing agent to cause or induce the signal in the assay, which can be
an agent inducing, e.g., phosphorescence or fluorescence (ELISA), a
color reaction or a scintillation signal.

In the instant embodiment, where the assay is a
scintillation assay, the signal agent is a radionuclide, i.e., tritium or
I125, which induces the scintillant in the solid support to emit
measurable light radiation, i.e., photon emission, which can be
measured by using conventional scintillation and beta radiation
counters.

We have also discovered that introducing two or more
4-phosphono(difluoromethyl)phenylalanine (F2Pmp) groups into a
known binding agent greatly enhances the binding affinity of the
binding agent to the enzyme and improves its stability by rendering
the resulting complex less susceptible to hydrolytic cleavage.

A method for introducing one F2Pmp moiety into a
ligand is known in the art and is described in detail in Biochem.
Biophys. Res. Comm. Vol. 204, pp. 129-134 (1994), hereby incorporated
by reference for this particular purpose.

As a result of this technology we discovered a new class
of ligands having extremely good binding affinity for PTP1B. These
include:

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenyl-
alanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
N-Acetyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Lysinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Serinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Prolinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide, and
L-Isoleucinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-
phosphono(difluoromethyl)]-L-phenylalanine amide.

A useful ligand in the series is BzN-EJJ-CONH2, whose chemical
name is: N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-
phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide,
and its tritiated form, N-(3,5-Ditritio)benzoyl-L-glutamyl-[4-
phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono-
(difluoromethyl)]-L-phenylalanineamide.

Synthesis of both cold and hot ligands is described in the
Examples.

The following Examples are illustrative of carrying out
the invention and should not be construed as being limitations on the
scope or spirit of the instant invention.

EXAMPLES

1. Preparation of PTP1B Truncate (Amino Acid Sequence from 1-320)
and Fused GST-PTP1B Construct

An E. coli culture carrying a PET plasmid expressing
the full length PTP1B protein was disclosed in J. Chernoff et al., Proc.
Natl. Acad. Sci. USA, 87, pp. 2735-2739 (1990). This was modified to
a truncated PTP1B enzyme complex containing the active site with
amino acids 1-320 inclusive, by the following procedure:

The full length human PTP-1B cDNA sequence
(published in J. Chernoff et al., PNAS, USA, supra) cloned
into a PET vector was obtained from Dr. Raymond Erickson (Harvard
University). The PTP-1B cDNA sequence encoding amino acids 1-320
(Seq. ID No. 1) was amplified by PCR using the full length sequence
as template. The 5' primer used for the amplification included a
Bam HI site at the 5' end and the 3' primer had an Eco RI site at the
3' end. The amplified fragment was cloned into pCR2 (Invitrogen)
and sequenced to ensure that no sequence errors had been introduced
by Taq polymerase during the amplification. This sequence was
released from pCR2 by a Bam HI/Eco RI digest and the PTP-1B cDNA
fragment ligated into the GST fusion vector pGEX-2T (Pharmacia)
that had been digested with the same enzymes. The GST-PTP-1B
fusion protein expressed in E. coli has protein tyrosine
phosphatase activity. This same 1-320 PTP-1B sequence (Seq. ID No.
1) was then cloned into the expression vector pFLAG-2, where FLAG
is the octa-peptide AspTyrLysAspAspAspAspLys. This was done by
releasing the PTP-1B sequence from the pGEX-2T vector by Nco I/Eco
RI digest, filling in the ends of this fragment with Klenow and
blunt-end ligating into the blunted Eco RI site of pFLAG2. Site-directed
mutagenesis was performed on the pFLAG2-PTP-1B plasmid using the
Chameleon double-stranded mutagenesis kit (Stratagene)
to replace the active-site Cys-215 with serine. The
mutagenesis was carried out essentially as described by the
manufacturer and mutants were identified by DNA sequencing. The
FLAG-PTP-1B Cys215Ser mutant (Seq. ID No. 7) was expressed,
purified and found not to have any phosphatase activity. The GST-
PTP-1B Cys 215 Ser mutant was made using the mutated Cys 215 Ser
sequence of PTP-1B already cloned into pFLAG2, as follows. The
pFLAG2-PTP-1B Cys 215 Ser plasmid (Seq. ID No. 7) was digested
with Sal I (3' end of PTP-1B sequence) and filled in using Klenow
polymerase (New England Biolabs); the enzymes were heat
inactivated and the DNA redigested with Bgl II. The 500 bp 3' PTP-1B
cDNA fragment which is released and contains the mutated active
site was recovered. The pGEX-2T-PTP-1B plasmid was digested with
Eco RI (3' end of PTP-1B sequence), filled in by Klenow,
phenol/chloroform extracted and ethanol precipitated. This DNA
was then digested with Bgl II, producing two DNA fragments: a 500
bp 3' PTP-1B cDNA fragment that contains the active site and a 5.5 Kb
fragment containing the pGEX-2T vector plus the 5' end of PTP-1B.
The 5.5 Kb pGEX-2T 5' PTP-1B fragment was recovered and ligated
with the 500 bp Bgl II/Sal I fragment containing the mutated active
site. The ligation was transformed into bacteria (type DH5α, G) and
clones containing the mutated active site sequence were identified by
sequencing. The GST-PTP-1B Cys 215 Ser mutant was overexpressed,
purified and found not to have any phosphatase activity.

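
The sequence-level effect of the Cys to Ser mutagenesis, and the check
performed by sequencing, can be sketched in a few lines. The coding
sequence below is a short hypothetical toy sequence, not the PTP-1B
sequence of Seq. ID No. 1, and the choice of TCC as the serine codon is
likewise an assumption made only for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence (length a multiple of 3) to one-letter codes."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def substitute_residue(cds, position, expected, new_codon):
    """Replace the codon at a 1-based residue position after checking it."""
    start = 3 * (position - 1)
    old_codon = cds[start:start + 3]
    if CODON_TABLE.get(old_codon) != expected:
        raise ValueError(f"residue {position} is not {expected}")
    return cds[:start] + new_codon + cds[start + 3:]


# Hypothetical toy coding sequence with a cysteine codon at residue 3.
wild_type = "ATGGAATGCTCTGGT"
mutant = substitute_residue(wild_type, position=3, expected="C", new_codon="TCC")
wild_type_protein = translate(wild_type)
mutant_protein = translate(mutant)
```

Translating the sequenced clone and comparing it with the wild-type
translation confirms that only the intended residue has changed.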

2. Preparation of Tritiated BzN-EJJ-CONH2

This compound can be prepared as outlined in Scheme 1,
below, and by following the procedures:

Synthesis of N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-
phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide
(BzN-EJJ-CONH2)

1.0 g of TentaGel® S RAM resin (RAPP polymer, ~0.2
mmol/g), as represented by the shaded bead in Scheme 1, was treated
with piperidine (3 mL) in DMF (5 mL) for 30 min. The resin
(symbolized by the circular P, containing the remainder of the
organic molecule except the amino group) was washed successively
with DMF (3 x 10 mL) and CH2Cl2 (10 mL) and air dried. A solution
of DMF (5 mL), Nα-Fmoc-4-[diethylphosphono-(difluoromethyl)]-L-
phenylalanine (350 mg), where Fmoc is 9-fluorenylmethoxycarbonyl,
and O-(7-azabenzotriazol-1-yl)-1,1,3,3-tetramethyluronium
hexafluorophosphate (acronym being HATU, 228 mg) was treated
with diisopropylethylamine (0.21 mL) and, after 15 min., was added
to the resin in 3 mL of DMF. After 1 h, the resin was washed
successively with DMF (3 x 10 mL) and CH2Cl2 (10 mL) and air dried.
The sequence was repeated two times, first using Nα-Fmoc-4-
[diethylphosphono-(difluoromethyl)]-L-phenylalanine and then
using N-Fmoc-L-glutamic acid gamma-t-butyl ester. After the final
coupling, the resin bound tripeptide was treated with a mixture of
piperidine (3 mL) in DMF (5 mL) for 30 min. and was then washed
successively with DMF (3 x 10 mL) and CH2Cl2 (10 mL) and air dried.
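
The reagent excess used in the first coupling can be checked with a short
calculation. The molecular weights and the density below are standard
reference values supplied here as assumptions (they are not stated in the
text), and the resin loading is taken at its nominal value of about 0.2
mmol/g.

```python
# Reference values (assumptions, not from the text).
HATU_MW = 380.23             # g/mol
DIPEA_MW = 129.24            # g/mol, diisopropylethylamine
DIPEA_DENSITY = 0.742        # g/mL


def mmol_from_mass(mass_mg, mw):
    """Millimoles from a mass in mg and a molecular weight in g/mol."""
    return mass_mg / mw


def mmol_from_volume(volume_ml, density_g_per_ml, mw):
    """Millimoles of a neat liquid from its volume, density and molecular weight."""
    return volume_ml * density_g_per_ml * 1000.0 / mw


# Quantities quoted in the procedure: 1.0 g resin at ~0.2 mmol/g,
# 228 mg HATU and 0.21 mL diisopropylethylamine.
resin_mmol = 1.0 * 0.2
hatu_equiv = mmol_from_mass(228.0, HATU_MW) / resin_mmol
dipea_equiv = mmol_from_volume(0.21, DIPEA_DENSITY, DIPEA_MW) / resin_mmol
```

On these assumptions the coupling uses roughly 3 equivalents of HATU and
roughly 6 equivalents of base relative to the nominal resin loading.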

To a solution of benzoic acid (61 mg) and HATU (190 mg)
in DMF (1 mL) was added diisopropylethylamine (0.17 mL) and, after
15 min., the mixture was added to a portion of the resin prepared
above (290 mg) in 1 mL DMF. After 90 min. the resin was washed
successively with DMF (3 x 10 mL) and CH2Cl2 (10 mL) and air dried.

The resin was treated with 2 mL of a mixture of TFA:water (9:1) and
0.05 mL of triisopropylsilane (TIPS-H) for 1 h. The resin was filtered
off and the filtrate was diluted with water (2 mL) and concentrated in
vacuo at 35°C. The residue was treated with 2.5 mL of a mixture of
TFA:DMS:TMSOTf (5:3:1) and 0.05 mL of TIPS-H, and stirred at 25°C 
for 15 h. (TFA is trifluoroacetic acid, DMS is dimethyl sulfide,
TMSOTf is trimethylsilyl trifluoromethanesulfonate.)

The desired tripeptide, the title compound, was purified
by reverse phase HPLC (C18 column, 25 x 100 mm) using a mobile
phase gradient from 0.2% TFA in water to 50/50 acetonitrile/0.2%
TFA in water over 40 min. and monitoring at 230 nm. The fraction
eluting at approximately 14.3 min. was collected, concentrated and
lyophilized to yield the title compound as a white foam.

Synthesis of N-(3,5-Ditritio)benzoyl-L-glutamyl-[4-phosphono(difluoro-
methyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenyl-
alanineamide

The above procedure described for the preparation of
BzN-EJJ-CONH2 was repeated, but substituting 3,5-dibromobenzoic
acid for benzoic acid. After HPLC purification as before, except using
a gradient over 30 min. and collecting the fraction at approximately
18.3 min., the dibromo-containing tripeptide was obtained as a white
foam.

A portion of this material (2 mg) was dissolved in
methanol/triethylamine (0.5 mL, 4/1), 10% Pd-C (2 mg) was added,
and the mixture was stirred under an atmosphere of tritium gas for 24 h.
The mixture was filtered through Celite, washing with methanol, and
the filtrate was concentrated. The title compound was obtained after
purification by semi-preparative HPLC using a C18 column and an
isocratic mobile phase of acetonitrile/0.2% TFA in water (15:100). The
fraction eluting at approximately 5 min. was collected and
concentrated in vacuo. The title compound was dissolved in 10 mL of
methanol/water (9:1) to provide a 0.1 mg/mL solution of specific
activity 39.4 Ci/mmol.

SCHEME 1




[Reaction scheme not reproduced; only its labels survive. Starting
material: TentaGel® S RAM polymer. Coupling steps, carried out with the
Fmoc-protected amino acid bearing the PO(OEt)2 and CF2 groups and
repeated in the continuation of the scheme: 1. HATU, (i-Pr)2NEt, DMF;
2. piperidine, DMF. Final steps: 1. TFA-H2O (9:1);
2. TFA-DMS-TMSOTf-TIPSH; 3. HPLC purification; 4. for X = Br: T2(g),
10% Pd-C, MeOH, Et3N; HPLC purification. The product carries PO(OH)2
and CF2 groups, with X = H or T.]

By following the above described procedure for BzN-EJJ-
CONH2, the following other peptide inhibitors were also similarly
prepared:

N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
N-Acetyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Lysinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Serinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide,
L-Prolinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide, and
L-Isoleucinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-
[4-phosphono(difluoromethyl)]-L-phenylalanine amide.

4. Phosphatase Assay Protocol 

Materials: 

EDTA - ethylenediaminetetraacetic acid (Sigma)

DMH - N,N'-dimethyl-N,N'-bis(mercaptoacetyl)-
hydrazine (synthesis published in J. Org. Chem. 56, pp. 2332-
2337 (1991) by R. Singh and G.M. Whitesides; can be substituted
with DTT - dithiothreitol)

Bistris - 2,2-bis(hydroxymethyl)-2,2',2''-nitrilotriethanol (Sigma)

Triton X-100 - octylphenolpoly(ethyleneglycolether)10 (Pierce)

Antibody: Anti-glutathione S-transferase rabbit (H and 
L) fraction (Molecular Probes) 

Enzyme: Human recombinant PTP1B, containing 
amino acids 1-320, (Seq. ID No. 1) fused to GST enzyme (glutathione 
S-transferase) purified by affinity chromatography. Wild type (Seq. 
ID No. 1) contains active site cysteine(215), whereas mutant (Seq. ID 
No. 7) contains active site serine(215). 

Tritiated peptide: Bz-NEJJ-CONH2, Mwt. 808, empirical 
formula, C32H32T2O12P2F4 
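The molecular weight listed here, together with the stock described in the radiolabelling procedure (0.1 mg/mL at 39.4 Ci/mmol), fixes the molar and radioactive concentration of the tritiated peptide stock. The following is a minimal worked calculation using only those three figures; the 50 nM target is the working concentration named in the binding protocol, and the variable names are illustrative.

```python
# Figures stated in the document.
STOCK_MG_PER_ML = 0.1          # tritiated peptide stock
MW_G_PER_MOL = 808.0           # molecular weight of the tritiated peptide
SPECIFIC_ACTIVITY_CI_PER_MMOL = 39.4
WORKING_NM = 50.0              # working solution used in the assay

# mg/mL divided by g/mol gives mmol/mL, which is numerically mol/L.
stock_molar = STOCK_MG_PER_ML / MW_G_PER_MOL
stock_um = stock_molar * 1e6                       # micromolar

# Radioactive concentration: Ci/mmol x mmol/mL = Ci/mL, reported in mCi/mL.
stock_mci_per_ml = SPECIFIC_ACTIVITY_CI_PER_MMOL * stock_molar * 1000.0

# Fold dilution needed to reach the working concentration.
fold_dilution = (stock_molar * 1e9) / WORKING_NM

print(f"stock concentration: {stock_um:.1f} uM")
print(f"radioactive concentration: {stock_mci_per_ml:.2f} mCi/mL")
print(f"dilution to {WORKING_NM:.0f} nM: about {fold_dilution:.0f}-fold")
```

The stock is therefore on the order of 0.1 mM, so the working solution is a dilution of a few thousand fold.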

Stock Solutions

(10X) Assay Buffer        500 mM Bistris (Sigma), pH 6.2, MW=209.2
                          20 mM EDTA (GIBCO/BRL)
                          Store at 4°C.

Prepare fresh daily:

Assay Buffer (1X)         50 mM Bistris
(room temp.)              2 mM EDTA
                          5 mM DMH (MW=208)

Enzyme Dilution           50 mM Bistris
Buffer (keep on ice)      2 mM EDTA
                          5 mM DMH
                          20% Glycerol (Sigma)
                          0.01 mg/ml Triton X-100 (Pierce)

Antibody Dilution         50 mM Bistris
Buffer (keep on ice)      2 mM EDTA
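The buffer recipes above reduce to simple mass and dilution arithmetic. The sketch below is a convenience calculation, not part of the disclosed protocol: the molecular weights for Bistris and DMH are the ones given in the recipes, while the batch volumes (1 L of 10X stock, 100 mL of 1X buffer) are assumptions chosen for the example.

```python
def grams_needed(conc_mm, volume_ml, mw):
    """Mass in grams to make `volume_ml` of a `conc_mm` (millimolar) solution."""
    moles = (conc_mm / 1000.0) * (volume_ml / 1000.0)
    return moles * mw


def stock_volume_ml(final_volume_ml, fold=10):
    """Volume of concentrated stock to dilute to `final_volume_ml`."""
    return final_volume_ml / fold


# 1 L of 10X assay buffer (batch size is an assumption).
bistris_g_10x = grams_needed(500, 1000, 209.2)

# 100 mL of 1X assay buffer prepared fresh (batch size is an assumption).
stock_ml_for_1x = stock_volume_ml(100)
dmh_mg_1x = grams_needed(5, 100, 208) * 1000.0

print(f"Bistris for 1 L of 10X buffer: {bistris_g_10x:.1f} g")
print(f"10X stock for 100 mL of 1X buffer: {stock_ml_for_1x:.0f} mL")
print(f"DMH for 100 mL of 1X buffer: {dmh_mg_1x:.0f} mg")
```

Because DMH is added only to the freshly prepared 1X buffers, its mass is calculated on the final volume rather than on the 10X stock.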

IC50 Binding Assay Protocol: 

Compounds (ligands) which potentially inhibit the
binding of a radioactive ligand to the specific phosphatase are
screened in a 96-well plate format as follows:

To each well are added the following solutions @ 25°C in
the following chronological order:

1. 110 µl of assay buffer.

2. 10 µl of 50 nM tritiated BzN-EJJ-CONH2 in assay
buffer (1X) @ 25°C.

3. 10 µl of testing compound in DMSO at 10 different
concentrations in serial dilution (final DMSO, about 5% v/v) in
duplicate @ 25°C.

4. 10 µl of 3.75 µg/ml purified human recombinant
GST-PTP1B in enzyme dilution buffer.

5. The plate is shaken for 2 minutes.

6. 10 µl of 0.3 µg/ml anti-glutathione S-transferase
(anti-GST) rabbit IgG (Molecular Probes) diluted in antibody dilution
buffer @ 25°C.

7. The plate is shaken for 2 minutes.

8. 50 µl of protein A-PVT SPA beads (Amersham) @
25°C.

9. The plate is shaken for 5 minutes. The binding
signal is quantified on a Microbeta 96-well plate counter.

10. The non-specific signal is defined as the enzyme-
ligand binding in the absence of anti-GST antibody.

11. 100% binding activity is defined as the enzyme-
ligand binding in the presence of anti-GST antibody, but in the
absence of the testing ligands, with the non-specific binding
subtracted.

12. Percentage of inhibition is calculated accordingly.

13. IC50 value is approximated from the non-linear
regression fit with the 4-parameter/multiple sites equation (described
in: "Robust Statistics", New York, Wiley, by P.J. Huber (1981)) and
reported in nM units.

14. Test ligands (compounds) with larger than 90%
inhibition at 10 µM are defined as actives.
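The definitions in steps 10 to 14 can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' analysis code: every volume in steps 1 to 8 is read as microlitres, the counts in the usage lines are hypothetical, and the four-parameter logistic shown is one common form of such a curve, since the document cites the equation without printing it.

```python
# Additions per well from steps 1-8, read as microlitres (an assumption).
WELL_ADDITIONS_UL = {
    "assay buffer": 110,
    "tritiated ligand, 50 nM": 10,
    "test compound in DMSO": 10,
    "GST-PTP1B, 3.75 ug/ml": 10,
    "anti-GST antibody, 0.3 ug/ml": 10,
    "protein A-PVT SPA beads": 50,
}


def final_concentration(stock, added_ul, total_ul):
    """Concentration after dilution into the full well volume."""
    return stock * added_ul / total_ul


def percent_inhibition(sample, total_binding, nonspecific):
    """Inhibition relative to specific binding (steps 10-12).

    Specific binding is total binding minus the signal measured without
    anti-GST antibody; the same background is removed from the sample.
    """
    window = total_binding - nonspecific
    if window <= 0:
        raise ValueError("total binding must exceed the non-specific signal")
    return 100.0 * (total_binding - sample) / window


def is_active(inhibition_at_10_um):
    """Step 14: larger than 90% inhibition at 10 uM."""
    return inhibition_at_10_um > 90.0


def four_parameter_logistic(conc_nm, bottom, top, ic50_nm, hill):
    """A common 4-parameter form for percent binding versus concentration."""
    return bottom + (top - bottom) / (1.0 + (conc_nm / ic50_nm) ** hill)


total_ul = sum(WELL_ADDITIONS_UL.values())
ligand_final_nm = final_concentration(50.0, 10, total_ul)
dmso_percent = 100.0 * 10 / total_ul

# Hypothetical counts: 5200 total, 200 non-specific, 450 with test compound.
example_inhibition = percent_inhibition(450, 5200, 200)

print(total_ul, ligand_final_nm, dmso_percent)
print(example_inhibition, is_active(example_inhibition))
```

Read this way, the well volumes sum to a round total and reproduce the stated final DMSO content of about 5% v/v.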

The following Table I illustrates typical assay results of
examples of known compounds which competitively inhibit the
binding of the binding agent, BzN-EJJ-CONH2.

Preparation of Cathepsin K(O2) Mutant (CAT-K Mutant)

Cathepsin K is a prominent cysteine protease in human
osteoclasts and is believed to play a key role in osteoclast-mediated
bone resorption. Inhibitors of cathepsin K will be useful for the
treatment of bone disorders (such as osteoporosis) where excessive
bone resorption occurs. Cathepsin K is synthesized as a dormant
preproenzyme (Seq. ID No. 4). Both the pre-domain (Met1-Ala15) and
the prodomain (Leu16-Arg114) must be removed for full catalytic
activity. The mature form of the protease (Ala115-Met329) contains
the active site Cys residue (Cys139).

The mature form of cathepsin K is engineered for
expression in bacteria and other recombinant systems as a
MetAla115-Met329 construct by PCR-directed template modification of a
clone that is identified. Epitope-tagged variants are also generated:
Met[FLAG]Ala115-Met329 and MetAla115-Met329[FLAG], where
FLAG is the octapeptide AspTyrLysAspAspAspAspLys. For the
purpose of establishing a binding assay, several other constructs are
generated, including Met[FLAG]Ala115-[Cys139 to Ser139]-Met329 and
MetAla115-[Cys139 to Ser139]-Met329[FLAG] (where the active site
Cys is mutated to a Ser residue), and Met[FLAG]Ala115-[Cys139 to
Ala139]-Met329 and MetAla115-[Cys139 to Ala139]-Met329[FLAG]
(where the active site Cys is mutated to an Ala residue). In all cases,
the resulting re-engineered polypeptides can be used in a binding
assay by tethering the mutated enzymes to SPA beads via specific
anti-FLAG antibodies that are commercially available (IDI-KODAK).
Other epitope tags, GST and other fusions can also be used for this
purpose, and binding assay formats other than SPA can also be used.
Ligands based on the preferred substrate for cathepsin K (e.g. Ac-P2-
P1, Ac-P2-P1-aldehydes, Ac-P2-P1-ketones; where P1 is an amino
acid with a hydrophilic side chain, preferably Arg or Lys, and P2 is
an amino acid with a small hydrophobic side chain, preferably Leu,
Val or Phe) are suitable in their radiolabeled (tritiated) forms for
SPA-based binding assays. Similar binding assays can also be
established for other cathepsin family members.
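The construct definitions above can be illustrated with a short sequence-level sketch. It is hypothetical bookkeeping rather than a cloning protocol: the sixteen-residue window is residues 129-144 of SEQ ID NO: 4 written in one-letter code, and the helper names `substitute` and `construct_length` are assumptions introduced for this example.

```python
FLAG = "DYKDDDDK"  # AspTyrLysAspAspAspAspLys in one-letter code

# Residues 129-144 of the preproenzyme (SEQ ID NO: 4), one-letter code.
WINDOW_FIRST = 129
WINDOW = "PVKNQGQCGSCWAFSS"


def substitute(window, window_first, position, expected, replacement):
    """Replace one residue, addressed by preproenzyme numbering."""
    index = position - window_first
    if not 0 <= index < len(window):
        raise ValueError("position lies outside the supplied window")
    if window[index] != expected:
        raise ValueError(f"expected {expected} at {position}, found {window[index]}")
    return window[:index] + replacement + window[index + 1:]


def construct_length(first, last, n_flag=False, c_flag=False):
    """Residue count of a Met-initiated construct spanning first..last."""
    length = 1 + (last - first + 1)            # initiator Met plus enzyme
    length += len(FLAG) * (int(n_flag) + int(c_flag))
    return length


cys_to_ser = substitute(WINDOW, WINDOW_FIRST, 139, "C", "S")
cys_to_ala = substitute(WINDOW, WINDOW_FIRST, 139, "C", "A")

print(cys_to_ser)                                   # C139S window
print(cys_to_ala)                                   # C139A window
print(construct_length(115, 329))                   # MetAla115-Met329
print(construct_length(115, 329, n_flag=True))      # Met[FLAG]Ala115-Met329
```

Checking the expected residue before substituting guards against off-by-one errors when switching between preproenzyme and mature-enzyme numbering.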

Preparation of Apopain (caspase-3) Mutant

Apopain is the active form of a cysteine protease
belonging to the caspase superfamily of ICE/CED-3 like enzymes. It
is derived from a catalytically dormant proenzyme that contains both
the 17 kDa large subunit (p17) and 12 kDa (p12) small subunit of the
catalytically active enzyme within a 32 kDa proenzyme polypeptide
(p32). Apopain is a key mediator in the effector mechanism of
apoptotic cell death, and modulators of the activity of this enzyme, or
structurally-related isoforms, will be useful for the therapeutic
treatment of diseases where inappropriate apoptosis is prominent,
e.g., Alzheimer's disease.

The method used for production of apopain involves
folding of active enzyme from its constituent p17 and p12 subunits,
which are expressed separately in E. coli. The apopain p17 subunit
(Ser29-Asp175) and p12 subunit (Ser176-His277) are engineered for
expression as MetSer29-Asp175 and MetSer176-His277 constructs,
respectively, by PCR-directed template modification. For the purpose
of establishing a binding assay, several other constructs are
generated, including a MetSer29-[Cys163 to Ser163]-Asp175 large
subunit and a Met1-[Cys163 to Ser163]-His277 proenzyme. In the
former case, the active site Cys residue in the large subunit (p17) is
replaced with a Ser residue by site-directed mutagenesis. This large
subunit is then re-folded with the recombinant p12 subunit to
generate the mature form of the enzyme except with the active site
Cys163 mutated to a Ser163. In the latter case, the same Cys to Ser
mutation is made, except that the entire proenzyme is expressed. In
both cases, the resulting re-engineered polypeptides can be used in a
binding assay by tethering the mutated enzymes to SPA beads via
specific antibodies that are generated to recognize apopain (antibodies
against the prodomain, the large p17 subunit, the small p12 subunit
and the entire p17:p12 active enzyme have been generated). Epitope
tags or GST and other fusions could also be used for this purpose, and
binding assay formats other than SPA can also be used.
Ligands based on the preferred substrate for apopain (variants of
AspGluValAsp), such as Ac-AspGluValAsp, Ac-AspGluValAsp-
aldehydes and Ac-AspGluValAsp-ketones, are suitable in their
radiolabeled forms for SPA-based binding assays. Similar binding
assays can also be established for other caspase family members.
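The subunit boundaries and the active site substitution described above can be checked against the proenzyme sequence. The sketch below is illustrative only: the one-letter string is a transcription of SEQ ID NO: 6 as printed, with residue 43, which is unreadable in the listing, taken as Glu from the corresponding GAG codon in SEQ ID NO: 5. The helper names are assumptions introduced for the example.

```python
# Proenzyme (p32), transcribed from SEQ ID NO: 6 in blocks of 16 residues.
P32 = "".join([
    "MENTENSVDSKSIKNL", "EPKIIHGSESMDSGIS", "LDNSYKMDYPEMGLCI",
    "IINNKNFHKSTGMTSR", "SGTDVDAANLRETFRN", "LKYEVRNKNDLTREEI",
    "VELMRDVSKEDHSKRS", "SFVCVLLSHGEEGIIF", "GTNGPVDLKKITNFFR",
    "GDRCRSLTGKPKLFII", "QACRGTELDCGIETDS", "GVDDDMACHKIPVEAD",
    "FLYAYSTAPGYYSWRN", "SKDGSWFIQSLCAMLK", "QYADKLEFMHILTRVN",
    "RKVATEFESFSFDATF", "HAKKQIPCIVSMLTKE", "LYFYH",
])


def segment(sequence, first, last):
    """Residues first..last (1-based, inclusive)."""
    return sequence[first - 1:last]


def substitute(sequence, position, expected, replacement):
    """Point substitution at a 1-based position, checking the old residue."""
    if sequence[position - 1] != expected:
        raise ValueError(f"expected {expected} at position {position}")
    return sequence[:position - 1] + replacement + sequence[position:]


p17 = segment(P32, 29, 175)            # large subunit, Ser29-Asp175
p12 = segment(P32, 176, 277)           # small subunit, Ser176-His277
p32_c163s = substitute(P32, 163, "C", "S")
p17_c163s = "M" + segment(p32_c163s, 29, 175)   # MetSer29-[C163S]-Asp175

print(len(P32), len(p17), len(p12))
print(p17[0], p17[-1], p12[0], p12[-1])
print(segment(P32, 161, 165), segment(p32_c163s, 161, 165))
```

Applying the substitution in proenzyme numbering first and slicing afterwards keeps residue 163 correctly addressed in both the subunit and the full-length construct.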

DESCRIPTION OF THE SEQUENCE LISTINGS 

SEQ ID NO. 1 is the top sense DNA strand of Figures 2A and 2B
for the PTP1B tyrosine phosphatase enzyme.

SEQ ID NO. 2 is the amino acid sequence of Figures 2A and 2B for
the PTP1B tyrosine phosphatase enzyme.

SEQ ID NO. 3 is the top sense cDNA strand of Figures 3A, 3B and
3C for the Cathepsin K preproenzyme.

SEQ ID NO. 4 is the amino acid sequence of Figures 3A, 3B and 3C
for the Cathepsin K preproenzyme.

SEQ ID NO. 5 is the top sense cDNA strand of Figures 4A and 4B
for the CPP32 apopain proenzyme.

SEQ ID NO. 6 is the amino acid sequence of Figures 4A and 4B
for the CPP32 apopain proenzyme.

SEQ ID NO. 7 is the cDNA sequence of the human PTP-1B 1-320
Ser mutant.

SEQ ID NO. 8 is the amino acid sequence of the human
PTP-1B 1-320 Ser mutant.

SEQ ID NO. 9 is the cDNA sequence for the apopain C163S mutant.

SEQ ID NO. 10 is the amino acid sequence for the apopain C163S
mutant.

SEQ ID NO. 11 is the large subunit of the heterodimeric amino acid
sequence for the apopain C163S mutant.

SEQ ID NO. 12 is the cDNA sequence for the Cathepsin K C139S
mutant.

SEQ ID NO. 13 is the cDNA sequence for the Cathepsin K C139A
mutant.

SEQ ID NO. 14 is the amino acid sequence for the Cathepsin K
C139S mutant.

SEQ ID NO. 15 is the amino acid sequence for the Cathepsin K
C139A mutant.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: Desmarais, Sylvie
Friesen, Richard
Zamboni, Richard

(ii) TITLE OF INVENTION: NEW LIGANDS FOR PHOSPHATASE BINDING ASSAY

(iii) NUMBER OF SEQUENCES: 15

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: ROBERT J. NORTH - MERCK & CO., INC.
(B) STREET: 126 EAST LINCOLN AVENUE - P.O. BOX 2000
(C) CITY: RAHWAY
(D) STATE: NEW JERSEY
(E) COUNTRY: USA
(F) ZIP: 07065

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: FastSEQ for Windows Version 2.0

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US unknown
(B) FILING DATE: 04-NOV-1996
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: North, Robert J.
(B) REGISTRATION NUMBER: 27,366
(C) REFERENCE/DOCKET NUMBER: 19840 PCT

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 732-594-7262
(B) TELEFAX: 732-594-4720

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 963 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ATGGAGATGG AAAAGGAGTT CGAGCAGATC GACAAGTCCG GGAGCTGGGC GGCCATTTAC 60
CAGGATATCC GACATGAAGC CAGTGACTTC CCATGTAGAG TGGCCAAGCT TCCTAAGAAC 120
AAAAACCGAA ATAGGTACAG AGACGTCAGT CCCTTTGACC ATAGTCGGAT TAAACTACAT 180
CAAGAAGATA ATGACTATAT CAACGCTAGT TTGATAAAAA TGGAAGAAGC CCAAAGGAGT 240
TACATTCTTA CCCAGGGCCC TTTGCCTAAC ACATGCGGTC ACTTTTGGGA GATGGTGTGG 300
GAGCAGAAAA GCAGGGGTGT CGTCATGCTC AACAGAGTGA TGGAGAAAGG TTCGTTAAAA 360
TGCGCACAAT ACTGGCCACA AAAAGAAGAA AAAGAGATGA TCTTTGAAGA CACAAATTTG 420
AAATTAACAT TGATCTCTGA AGATATCAAG TCATATTATA CAGTGCGACA GCTAGAATTG 480
GAAAACCTTA CAACCCAAGA AACTCGAGAG ATCTTACATT TCCACTATAC CACATGGCCT 540
GACTTTGGAG TCCCTGAATC ACCAGCCTCA TTCTTGAACT TTCTTTTCAA AGTCCGAGAG 600
TCAGGGTCAC TCAGCCCGGA GCACGGGCCC GTTGTGGTGC ACTGCAGTGC AGGCATCGGC 660
AGGTCTGGAA CCTTCTGTCT GGCTGATACC TGCCTCCTGC TGATGGACAA GAGGAAAGAC 720
CCTTCTTCCG TTGATATCAA GAAAGTGCTG TTAGAAATGA GGAAGTTTCG GATGGGGTTG 780
ATCCAGACAG CCGACCAGCT GCGCTTCTCC TACCTGGCTG TGATCGAAGG TGCCAAATTC 840
ATCATGGGGG ACTCTTCCGT GCAGGATCAG TGGAAGGAGC TTTCCCACGA GGACCTGGAG 900
CCCCCACCCG AGCATATCCC CCCACCTCCC CGGCCACCCA AACGAATCCT GGAGCCACAC 960
TGA 963



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 320 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Glu Met Glu Lys Glu Phe Glu Gln Ile Asp Lys Ser Gly Ser Trp
1               5                   10                  15
Ala Ala Ile Tyr Gln Asp Ile Arg His Glu Ala Ser Asp Phe Pro Cys
            20                  25                  30
Arg Val Ala Lys Leu Pro Lys Asn Lys Asn Arg Asn Arg Tyr Arg Asp
        35                  40                  45
Val Ser Pro Phe Asp His Ser Arg Ile Lys Leu His Gln Glu Asp Asn
    50                  55                  60
Asp Tyr Ile Asn Ala Ser Leu Ile Lys Met Glu Glu Ala Gln Arg Ser
65                  70                  75                  80
Tyr Ile Leu Thr Gln Gly Pro Leu Pro Asn Thr Cys Gly His Phe Trp
                85                  90                  95
Glu Met Val Trp Glu Gln Lys Ser Arg Gly Val Val Met Leu Asn Arg
            100                 105                 110
Val Met Glu Lys Gly Ser Leu Lys Cys Ala Gln Tyr Trp Pro Gln Lys
        115                 120                 125
Glu Glu Lys Glu Met Ile Phe Glu Asp Thr Asn Leu Lys Leu Thr Leu
    130                 135                 140
Ile Ser Glu Asp Ile Lys Ser Tyr Tyr Thr Val Arg Gln Leu Glu Leu
145                 150                 155                 160
Glu Asn Leu Thr Thr Gln Glu Thr Arg Glu Ile Leu His Phe His Tyr
                165                 170                 175
Thr Thr Trp Pro Asp Phe Gly Val Pro Glu Ser Pro Ala Ser Phe Leu
            180                 185                 190
Asn Phe Leu Phe Lys Val Arg Glu Ser Gly Ser Leu Ser Pro Glu His
        195                 200                 205
Gly Pro Val Val Val His Cys Ser Ala Gly Ile Gly Arg Ser Gly Thr
    210                 215                 220
Phe Cys Leu Ala Asp Thr Cys Leu Leu Leu Met Asp Lys Arg Lys Asp
225                 230                 235                 240
Pro Ser Ser Val Asp Ile Lys Lys Val Leu Leu Glu Met Arg Lys Phe
                245                 250                 255
Arg Met Gly Leu Ile Gln Thr Ala Asp Gln Leu Arg Phe Ser Tyr Leu
            260                 265                 270
Ala Val Ile Glu Gly Ala Lys Phe Ile Met Gly Asp Ser Ser Val Gln
        275                 280                 285
Asp Gln Trp Lys Glu Leu Ser His Glu Asp Leu Glu Pro Pro Pro Glu
    290                 295                 300
His Ile Pro Pro Pro Pro Arg Pro Pro Lys Arg Ile Leu Glu Pro His
305                 310                 315                 320



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1669 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

GAAACAAGCA CTGGATTCCA TATCCCACTG CCAAAACCGC ATGGTTCAGA TTATCGCTAT 60
TGCAGCTTTC ATCATAATAC ACACCTTTGC TGCCGAAACG AAGCCAGACA ACAGATTTCC 120
ATCAGCAGGA TGTGGGGGCT CAAGGTTCTG CTGCTACCTG TGGTGAGCTT TGCTCTGTAC 180
CCTGAGGAGA TACTGGACAC CCACTGGGAG CTATGGAAGA AGACCCACAG GAAGCAATAT 240
AACAACAAGG TGGATGAAAT CTCTCGGCGT TTAATTTGGG AAAAAAACCT GAAGTATATT 300
TCCATCCATA ACCTTGAGGC TTCTCTTGGT GTCCATACAT ATGAACTGGC TATGAACCAC 360
CTGGGGGACA TGACCAGTGA AGAGGTGGTT CAGAAGATGA CTGGACTCAA AGTACCCCTG 420
TCTCATTCCC GCAGTAATGA CACCCTTTAT ATCCCAGAAT GGGAAGGTAG AGCCCCAGAC 480
TCTGTCGACT ATCGAAAGAA AGGATATGTT ACTCCTGTCA AAAATCAGGG TCAGTGTGGT 540
TCCTGTTGGG CTTTTAGCTC TGTGGGTGCC CTGGAGGGCC AACTCAAGAA GAAAACTGGC 600
AAACTCTTAA ATCTGAGTCC CCAGAACCTA GTGGATTGTG TGTCTGAGAA TGATGGCTGT 660
GGAGGGGGCT ACATGACCAA TGCCTTCCAA TATGTGCAGA AGAACCGGGG TATTGACTCT 720
GAAGATGCCT ACCCATATGT GGGACAGGAA GAGAGTTGTA TGTACAACCC AACAGGCAAG 780
GCAGCTAAAT GCAGAGGGTA CAGAGAGATC CCCGAGGGGA ATGAGAAAGC CCTGAAGAGG 840
GCAGTGGCCC GAGTGGGACC TGTCTCTGTG GCCATTGATG CAAGCCTGAC CTCCTTCCAG 900
TTTTACAGCA AAGGTGTGTA TTATGATGAA AGCTGCAATA GCGATAATCT GAACCATGCG 960
GTTTTGGCAG TGGGATATGG AATCCAGAAG GGAAACAAGC ACTGGATAAT TAAAAACAGC 1020
TGGGGAGAAA ACTGGGGAAA CAAAGGATAT ATCCTCATGG CTCGAAATAA GAACAACGCC 1080
TGTGGCATTG CCAACCTGGC CAGCTTCCCC AAGATGTGAC TCCAGCCAGC CAAATCCATC 1140
CTGCTCTTCC ATTTCTTCCA CGATGGTGCA GTGTAACGAT GCACTTTGGA AGGGAGTTGG 1200
TGTGCTATTT TTGAAGCAGA TGTGGTGATA CTGAGATTGT CTGTTCAGTT TCCCCATTTG 1260
[nucleotides 1261-1320 illegible in the source] 1320
GGCCATCAGG ACTTTCCCTG ACAGCTGTGT ACTCTTAGGC TAAGAGATGT GACTACAGCC 1380
TGCCCCTGAC TGTGTTGTCC CAGGGCTGAT GCTGTACAGG TACAGGCTGG AGATTTTCAC 1440
ATAGGTTAGA TTCTCATTCA CGGGACTAGT TAGCTTTAAG CACCCTAGAG GACTAGGGTA 1500
ATCTGACTTC TCACTTCCTA AGTTCCCTTC TATATCCTCA AGGTAGAAAT GTCTATGTTT 1560
TCTACTCCAA TTCATAAATC TATTCATAAG TCTTTGGTAC AAGTTTACAT GATAAAAAGA 1620
AATGTGATTT GTCTTCCCTT CTTTGCACTT TTGAAATAAA GTATTTATC 1669



(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 329 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Met Trp Gly Leu Lys Val Leu Leu Leu Pro Val Val Ser Phe Ala Leu
1               5                   10                  15
Tyr Pro Glu Glu Ile Leu Asp Thr His Trp Glu Leu Trp Lys Lys Thr
            20                  25                  30
His Arg Lys Gln Tyr Asn Asn Lys Val Asp Glu Ile Ser Arg Arg Leu
        35                  40                  45
Ile Trp Glu Lys Asn Leu Lys Tyr Ile Ser Ile His Asn Leu Glu Ala
    50                  55                  60
Ser Leu Gly Val His Thr Tyr Glu Leu Ala Met Asn His Leu Gly Asp
65                  70                  75                  80
Met Thr Ser Glu Glu Val Val Gln Lys Met Thr Gly Leu Lys Val Pro
                85                  90                  95
Leu Ser His Ser Arg Ser Asn Asp Thr Leu Tyr Ile Pro Glu Trp Glu
            100                 105                 110
Gly Arg Ala Pro Asp Ser Val Asp Tyr Arg Lys Lys Gly Tyr Val Thr
        115                 120                 125
Pro Val Lys Asn Gln Gly Gln Cys Gly Ser Cys Trp Ala Phe Ser Ser
    130                 135                 140
Val Gly Ala Leu Glu Gly Gln Leu Lys Lys Lys Thr Gly Lys Leu Leu
145                 150                 155                 160
Asn Leu Ser Pro Gln Asn Leu Val Asp Cys Val Ser Glu Asn Asp Gly
                165                 170                 175
Cys Gly Gly Gly Tyr Met Thr Asn Ala Phe Gln Tyr Val Gln Lys Asn
            180                 185                 190
Arg Gly Ile Asp Ser Glu Asp Ala Tyr Pro Tyr Val Gly Gln Glu Glu
        195                 200                 205
Ser Cys Met Tyr Asn Pro Thr Gly Lys Ala Ala Lys Cys Arg Gly Tyr
    210                 215                 220
Arg Glu Ile Pro Glu Gly Asn Glu Lys Ala Leu Lys Arg Ala Val Ala
225                 230                 235                 240
Arg Val Gly Pro Val Ser Val Ala Ile Asp Ala Ser Leu Thr Ser Phe
                245                 250                 255
Gln Phe Tyr Ser Lys Gly Val Tyr Tyr Asp Glu Ser Cys Asn Ser Asp
            260                 265                 270
Asn Leu Asn His Ala Val Leu Ala Val Gly Tyr Gly Ile Gln Lys Gly
        275                 280                 285
Asn Lys His Trp Ile Ile Lys Asn Ser Trp Gly Glu Asn Trp Gly Asn
    290                 295                 300
Lys Gly Tyr Ile Leu Met Ala Arg Asn Lys Asn Asn Ala Cys Gly Ile
305                 310                 315                 320
Ala Asn Leu Ala Ser Phe Pro Lys Met
                325



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1001 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

CTGCAGGAAT TCGGCACGAG GGGTGCTATT GTGAGGCGGT TGTAGAAGTT AATAAAGGTA 60
TCCATGGAGA ACACTGAAAA CTCAGTGGAT TCAAAATCCA TTAAAAATTT GGAACCAAAG 120
ATCATACATG GAAGCGAATC AATGGACTCT GGAATATCCC TGGACAACAG TTATAAAATG 180
GATTATCCTG AGATGGGTTT ATGTATAATA ATTAATAATA AGAATTTTCA TAAGAGCACT 240
GGAATGACAT CTCGGTCTGG TACAGATGTC GATGCAGCAA ACCTCAGGGA AACATTCAGA 300
AACTTGAAAT ATGAAGTCAG GAATAAAAAT GATCTTACAC GTGAAGAAAT TGTGGAATTG 360
ATGCGTGATG TTTCTAAAGA AGATCACAGC AAAAGGAGCA GTTTTGTTTG TGTGCTTCTG 420
AGCCATGGTG AAGAAGGAAT AATTTTTGGA ACAAATGGAC CTGTTGACCT GAAAAAAATA 480
ACAAACTTTT TCAGAGGGGA TCGTTGTAGA AGTCTAACTG GAAAACCCAA ACTTTTCATT 540
ATTCAGGCCT GCCGTGGTAC AGAACTGGAC TGTGGCATTG AGACAGACAG TGGTGTTGAT 600
GATGACATGG CGTGTCATAA AATACCAGTG GAGGCCGACT TCTTGTATGC ATACTCCACA 660
GCACCTGGTT ATTATTCTTG GCGAAATTCA AAGGATGGCT CCTGGTTCAT CCAGTCGCTT 720
TGTGCCATGC TGAAACAGTA TGCCGACAAG CTTGAATTTA TGCACATTCT TACCCGGGTT 780
AACCGAAAGG TGGCAACAGA ATTTGAGTCC TTTTCCTTTG ACGCTACTTT TCATGCAAAG 840
AAACAGATTC CATGTATTGT TTCCATGCTC ACAAAAGAAC TCTATTTTTA TCACTAAAGA 900
AATGGTTGGT TGGTGGTTTT TTTTAGTTTG TATGCCAAGT GAGAAGATGG TATATTTGGT 960
ACTGTATTTC CCTCTCATTT TGACCTACTC TCATGCTGCA G 1001



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 277 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Glu Asn Thr Glu Asn Ser Val Asp Ser Lys Ser Ile Lys Asn Leu
1               5                   10                  15
Glu Pro Lys Ile Ile His Gly Ser Glu Ser Met Asp Ser Gly Ile Ser
            20                  25                  30
Leu Asp Asn Ser Tyr Lys Met Asp Tyr Pro Glu Met Gly Leu Cys Ile
        35                  40                  45
Ile Ile Asn Asn Lys Asn Phe His Lys Ser Thr Gly Met Thr Ser Arg
    50                  55                  60
Ser Gly Thr Asp Val Asp Ala Ala Asn Leu Arg Glu Thr Phe Arg Asn
65                  70                  75                  80
Leu Lys Tyr Glu Val Arg Asn Lys Asn Asp Leu Thr Arg Glu Glu Ile
                85                  90                  95
Val Glu Leu Met Arg Asp Val Ser Lys Glu Asp His Ser Lys Arg Ser
            100                 105                 110
Ser Phe Val Cys Val Leu Leu Ser His Gly Glu Glu Gly Ile Ile Phe
        115                 120                 125
Gly Thr Asn Gly Pro Val Asp Leu Lys Lys Ile Thr Asn Phe Phe Arg
    130                 135                 140
Gly Asp Arg Cys Arg Ser Leu Thr Gly Lys Pro Lys Leu Phe Ile Ile
145                 150                 155                 160
Gln Ala Cys Arg Gly Thr Glu Leu Asp Cys Gly Ile Glu Thr Asp Ser
                165                 170                 175
Gly Val Asp Asp Asp Met Ala Cys His Lys Ile Pro Val Glu Ala Asp
            180                 185                 190
Phe Leu Tyr Ala Tyr Ser Thr Ala Pro Gly Tyr Tyr Ser Trp Arg Asn
        195                 200                 205
Ser Lys Asp Gly Ser Trp Phe Ile Gln Ser Leu Cys Ala Met Leu Lys
    210                 215                 220
Gln Tyr Ala Asp Lys Leu Glu Phe Met His Ile Leu Thr Arg Val Asn
225                 230                 235                 240
Arg Lys Val Ala Thr Glu Phe Glu Ser Phe Ser Phe Asp Ala Thr Phe
                245                 250                 255
His Ala Lys Lys Gln Ile Pro Cys Ile Val Ser Met Leu Thr Lys Glu
            260                 265                 270
Leu Tyr Phe Tyr His
        275

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 963 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

ATGGAGATGG AAAAGGAGTT CGAGCAGATC GACAAGTCCG GGAGCTGGGC GGCCATTTAC 60
CAGGATATCC GACATGAAGC CAGTGACTTC CCATGTAGAG TGGCCAAGCT TCCTAAGAAC 120
AAAAACCGAA ATAGGTACAG AGACGTCAGT CCCTTTGACC ATAGTCGGAT TAAACTACAT 180
CAAGAAGATA ATGACTATAT CAACGCTAGT TTGATAAAAA TGGAAGAAGC CCAAAGGAGT 240
TACATTCTTA CCCAGGGCCC TTTGCCTAAC ACATGCGGTC ACTTTTGGGA GATGGTGTGG 300
GAGCAGAAAA GCAGGGGTGT CGTCATGCTC AACAGAGTGA TGGAGAAAGG TTCGTTAAAA 360
TGCGCACAAT ACTGGCCACA AAAAGAAGAA AAAGAGATGA TCTTTGAAGA CACAAATTTG 420
AAATTAACAT TGATCTCTGA AGATATCAAG TCATATTATA CAGTGCGACA GCTAGAATTG 480
GAAAACCTTA CAACCCAAGA AACTCGAGAG ATCTTACATT TCCACTATAC CACATGGCCT 540
GACTTTGGAG TCCCTGAATC ACCAGCCTCA TTCTTGAACT TTCTTTTCAA AGTCCGAGAG 600
TCAGGGTCAC TCAGCCCGGA GCACGGGCCC GTTGTGGTGC ACAGCAGTGC AGGCATCGGC 660
AGGTCTGGAA CCTTCTGTCT GGCTGATACC TGCCTCCTGC TGATGGACAA GAGGAAAGAC 720
CCTTCTTCCG TTGATATCAA GAAAGTGCTG TTAGAAATGA GGAAGTTTCG GATGGGGTTG 780
ATCCAGACAG CCGACCAGCT GCGCTTCTCC TACCTGGCTG TGATCGAAGG TGCCAAATTC 840
ATCATGGGGG ACTCTTCCGT GCAGGATCAG TGGAAGGAGC TTTCCCACGA GGACCTGGAG 900
CCCCCACCCG AGCATATCCC CCCACCTCCC CGGCCACCCA AACGAATCCT GGAGCCACAC 960
TGA 963
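The correspondence between the cDNA of SEQ ID NO: 7 and the peptide of SEQ ID NO: 8 can be checked by reading the coding strand three bases at a time with the standard genetic code. The following is a minimal Python sketch and is not part of the original disclosure; the names `CODON_TABLE`, `translate` and `SEQ7_FIRST_60` are illustrative, and only bases 1-60 of the listing are used.

```python
from itertools import product

# Standard genetic code, enumerated in the conventional T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding strand, read in frame from its first base,
    into one-letter amino acid codes. Spaces between groups are ignored."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # drop any incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Bases 1-60 of SEQ ID NO: 7 as printed in the listing.
SEQ7_FIRST_60 = (
    "ATGGAGATGG AAAAGGAGTT CGAGCAGATC GACAAGTCCG GGAGCTGGGC GGCCATTTAC"
)

peptide = translate(SEQ7_FIRST_60)
print(peptide)
```

The printed string is MEMEKEFEQIDKSGSWAAIY, that is, Met Glu Met Glu Lys Glu Phe Glu Gln Ile Asp Lys Ser Gly Ser Trp Ala Ala Ile Tyr, which corresponds to the first twenty residues of SEQ ID NO: 8.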



(2) INFORMATION FOR SEQ ID NO : 8 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 322 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

Met Glu Met Glu Lys Glu Phe Glu Gln Ile Asp Lys Ser Gly Ser Trp
1               5                   10                  15

Ala Ala Ile Tyr Gln Asp Ile Arg His Glu Ala Ser Asp Phe Pro Cys
            20                  25                  30

Arg Val Ala Lys Leu Pro Lys Asn Lys Asn Arg Asn Arg Tyr Arg Asp
        35                  40                  45

Val Ser Pro Phe Asp His Ser Arg Ile Lys Leu His Gln Glu Asp Asn
    50                  55                  60

Asp Tyr Ile Asn Ala Ser Leu Ile Lys Met Glu Glu Ala Gln Arg Ser
65                  70                  75                  80

Tyr Ile Leu Thr Gln Gly Pro Leu Pro Asn Thr Cys Gly His Phe Trp
                85                  90                  95

Glu Met Val Trp Glu Gln Lys Ser Arg Gly Val Val Met Leu Asn Arg
            100                 105                 110

Val Met Glu Lys Gly Ser Leu Lys Cys Ala Gln Tyr Trp Pro Gln Lys
        115                 120                 125

Glu Glu Lys Glu Met Ile Phe Glu Asp Thr Asn Leu Lys Leu Thr Leu
    130                 135                 140

Ile Ser Glu Asp Ile Lys Ser Tyr Tyr Thr Val Arg Gln Leu Glu Leu
145                 150                 155                 160

Glu Asn Leu Thr Thr Gln Glu Thr Arg Glu Ile Leu His Phe His Tyr
                165                 170                 175

Thr Thr Trp Pro Asp Phe Gly Val Pro Glu Ser Pro Ala Ser Phe Leu
            180                 185                 190

Asn Phe Leu Phe Lys Val Arg Glu Ser Gly Ser Leu Ser Pro Glu His
        195                 200                 205

Gly Pro Val Val Val His Ser Ser Ala Gly Ile Gly Thr Cys Gly Arg
    210                 215                 220

Ser Gly Thr Phe Cys Leu Ala Asp Thr Cys Leu Leu Leu Met Asp Lys
225                 230                 235                 240

Arg Lys Asp Pro Ser Ser Val Asp Ile Lys Lys Val Leu Leu Glu Met
                245                 250                 255

Arg Lys Phe Arg Met Gly Leu Ile Gln Thr Ala Asp Gln Leu Arg Phe
            260                 265                 270

Ser Tyr Leu Ala Val Ile Glu Gly Ala Lys Phe Ile Met Gly Asp Ser
        275                 280                 285

Ser Val Gln Asp Gln Trp Lys Glu Leu Ser His Glu Asp Leu Glu Pro
    290                 295                 300

Pro Pro Glu His Ile Pro Pro Pro Pro Arg Pro Pro Lys Arg Ile Leu
305                 310                 315                 320

Glu Pro































(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1001 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

CTGCAGGAAT TCGGCACGAG GGGTGCTATT GTGAGGCGGT TGTAGAAGTT AATAAAGGTA 60
TCCATGGAGA ACACTGAAAA CTCAGTGGAT TCAAAATCCA TTAAAAATTT GGAACCAAAG 120
ATCATACATG GAAGCGAATC AATGGACTCT GGAATATCCC TGGACAACAG TTATAAAATG 180
GATTATCCTG AGATGGGTTT ATGTATAATA ATTAATAATA AGAATTTTCA TAAGAGCACT 240
GGAATGACAT CTCGGTCTGG TACAGATGTC GATGCAGCAA ACCTCAGGGA AACATTCAGA 300
AACTTGAAAT ATGAAGTCAG GAATAAAAAT GATCTTACAC GTGAAGAAAT TGTGGAATTG 360
ATGCGTGATG TTTCTAAAGA AGATCACAGC AAAAGGAGGA GTTTTGTTTG TGTGCTTCTG 420
AGCCATGGTG AAGAAGGAAT AATTTTTGGA ACAAATGGAC CTGTTGACCT GAAAAAAATA 480
ACAAACTTTT TCAGAGGGGA TCGTTGTAGA AGTCTAACTG GAAAACCCAA ACTTTTCATT 540
ATTCAGGCCT CCCGTGGTAC AGAACTGGAC TGTGGCATTG AGACAGACAG TGGTGTTGAT 600




\ — vj* J. V_7 x v_ r\ x nn 






X • X X X r\ X vJv- 


ATAPTPP AC A 


660 


GGACCTGGTT ATTATTCTTG GCGAAATTCA AAGGATGGCT CCTGGTTCAT CCAGTCGCTT 720
TGTGCCATGC TGAAACAGTA TGCCGACAAG CTTGAATTTA TGCACATTCT TACCCGGGTT 780
AACCGAAAGG TGGCAACAGA ATTTGAGTCC TTTTCCTTTG ACGCTACTTT TCATGCAAAG 840
AAACAGATTC CATGTATTGT TTCCATGCTC ACAAAAGAAC TCTATTTTTA TCACTAAAGA 900
AATGGTTGGT TGGTGGTTTT TTTTAGTTTG TATGCCAAGT GAGAAGATGG TATATTTGGT 960
ACTGTATTTC CCTCTCATTT TGACCTACTC TCATGCTGCA G 1001



(2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 277 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

Met Glu Asn Thr Glu Asn Ser Val Asp Ser Lys Ser Ile Lys Asn Leu
1               5                   10                  15

Glu Pro Lys Ile Ile His Gly Ser Glu Ser Met Asp Ser Gly Ile Ser
            20                  25                  30

Leu Asp Asn Ser Tyr Lys Met Asp Tyr Pro Glu Met Gly Leu Cys Ile
        35                  40                  45

Ile Ile Asn Asn Lys Asn Phe His Lys Ser Thr Gly Met Thr Ser Arg
    50                  55                  60

Ser Gly Thr Asp Val Asp Ala Ala Asn Leu Arg Glu Thr Phe Arg Asn
65                  70                  75                  80

Leu Lys Tyr Glu Val Arg Asn Lys Asn Asp Leu Thr Arg Glu Glu Ile
                85                  90                  95

Val Glu Leu Met Arg Asp Val Ser Lys Glu Asp His Ser Lys Arg Ser
            100                 105                 110

Ser Phe Val Cys Val Leu Leu Ser His Gly Glu Glu Gly Ile Ile Phe
        115                 120                 125

Gly Thr Asn Gly Pro Val Asp Leu Lys Lys Ile Thr Asn Phe Phe Arg
    130                 135                 140

Gly Asp Arg Cys Arg Ser Leu Thr Gly Lys Pro Lys Leu Phe Ile Ile
145                 150                 155                 160

Gln Ala Ser Arg Gly Thr Glu Leu Asp Cys Gly Ile Glu Thr Asp Ser
                165                 170                 175

Gly Val Asp Asp Asp Met Ala Cys His Lys Ile Pro Val Glu Ala Asp
            180                 185                 190

Phe Leu Tyr Ala Tyr Ser Thr Ala Pro Gly Tyr Tyr Ser Trp Arg Asn
        195                 200                 205

Ser Lys Asp Gly Ser Trp Phe Ile Gln Ser Leu Cys Ala Met Leu Lys
    210                 215                 220

Gln Tyr Ala Asp Lys Leu Glu Phe Met His Ile Leu Thr Arg Val Asn
225                 230                 235                 240

Arg Lys Val Ala Thr Glu Phe Glu Ser Phe Ser Phe Asp Ala Thr Phe
                245                 250                 255

His Ala Lys Lys Gln Ile Pro Cys Ile Val Ser Met Leu Thr Lys Glu
            260                 265                 270

Leu Tyr Phe Tyr His
        275



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 277 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

Met Glu Asn Thr Glu Asn Ser Val Asp Ser Lys Ser Ile Lys Asn Leu
1               5                   10                  15

Glu Pro Lys Ile Ile His Gly Ser Glu Ser Met Asp Ser Gly Ile Ser
            20                  25                  30

Leu Asp Asn Ser Tyr Lys Met Asp Tyr Pro Glu Met Gly Leu Cys Ile
        35                  40                  45

Ile Ile Asn Asn Lys Asn Phe His Lys Ser Thr Gly Met Thr Ser Arg
    50                  55                  60

Ser Gly Thr Asp Val Asp Ala Ala Asn Leu Arg Glu Thr Phe Arg Asn
65                  70                  75                  80

Leu Lys Tyr Glu Val Arg Asn Lys Asn Asp Leu Thr Arg Glu Glu Ile
                85                  90                  95

Val Glu Leu Met Arg Asp Val Ser Lys Glu Asp His Ser Lys Arg Ser
            100                 105                 110

Ser Phe Val Cys Val Leu Leu Ser His Gly Glu Glu Gly Ile Ile Phe
        115                 120                 125

Gly Thr Asn Gly Pro Val Asp Leu Lys Lys Ile Thr Asn Phe Phe Arg
    130                 135                 140

Gly Asp Arg Cys Arg Ser Leu Thr Gly Lys Pro Lys Leu Phe Ile Ile
145                 150                 155                 160

Gln Ala Ser Arg Gly Thr Glu Leu Asp Cys Gly Ile Glu Thr Asp Ser
                165                 170                 175

Gly Val Asp Asp Asp Met Ala Cys His Lys Ile Pro Val Glu Ala Asp
            180                 185                 190

Phe Leu Tyr Ala Tyr Ser Thr Ala Pro Gly Tyr Tyr Ser Trp Arg Asn
        195                 200                 205

Ser Lys Asp Gly Ser Trp Phe Ile Gln Ser Leu Cys Ala Met Leu Lys
    210                 215                 220

Gln Tyr Ala Asp Lys Leu Glu Phe Met His Ile Leu Thr Arg Val Asn
225                 230                 235                 240

Arg Lys Val Ala Thr Glu Phe Glu Ser Phe Ser Phe Asp Ala Thr Phe
                245                 250                 255

His Ala Lys Lys Gln Ile Pro Cys Ile Val Ser Met Leu Thr Lys Glu
            260                 265                 270

Leu Tyr Phe Tyr His
        275



BNSDOCID: <WO 9820024A1_I_> 
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(2) INFORMATION FOR SEQ ID NO: 12; 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 990 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

ATGTGGGGGC TCAAGGTTCT GCTGCTACCT GTGGTGAGCT TTGCTCTGTA CCCTGAGGAG 60
ATACTGGACA CCCACTGGGA GCTATGGAAG AAGACCCACA GGAAGCAATA TAACAACAAG 120
GTGGATGAAA TCTCTCGGCG TTTAATTTGG GAAAAAAACC TGAAGTATAT TTCCATCCAT 180
AACCTTGAGG CTTCTCTTGG TGTCCATACA TATGAACTGG CTATGAACCA CCTGGGGGAC 240
ATGACCAGTG AAGAGGTGGT TCAGAAGATG ACTGGACTCA AAGTACCCCT GTCTCATTCC 300
CGCAGTAATG ACACCCTTTA TATCCCAGAA TGGGAAGGTA GAGCCCCAGA CTCTGTCGAC 360
TATCGAAAGA AAGGATATGT TACTCCTGTC AAAAATCAGG GTCAGTGTGG TTCCTCTTGG 420
GCTTTTAGCT CTGTGGGTGC CCTGGAGGGC CAACTCAAGA AGAAAACTGG CAAACTCTTA 480
AATCTGAGTC CCCAGAACCT AGTGGATTGT GTGTCTGAGA ATGATGGCTG TGGAGGGGGC 540
TACATGACCA ATGCCTTCCA ATATGTGCAG AAGAACCGGG GTATTGACTC TGAAGATGCC 600
TACCCATATG TGGGACAGGA AGAGAGTTGT ATGTACAACC CAACAGGCAA GGCAGCTAAA 660
TGCAGAGGGT ACAGAGAGAT CCCCGAGGGG AATGAGAAAG CCCTGAAGAG GGCAGTGGCC 720
CGAGTGGGAC CTGTCTCTGT GGCCATTGAT GCAAGCCTGA CCTCCTTCCA GTTTTACAGC 780
AAAGGTGTGT ATTATGATGA AAGCTGCAAT AGCGATAATC TGAACCATGC GGTTTTGGCA 840
GTGGGATATG GAATCCAGAA GGGAAACAAG CACTGGATAA TTAAAAACAG CTGGGGAGAA 900
AACTGGGGAA ACAAAGGATA TATCCTCATG GCTCGAAATA AGAACAACGC CTGTGGCATT 960
GCCAACCTGG CCAGCTTCCC CAAGATGTGA 990



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 990 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 

ATGTGGGGGC TCAAGGTTCT GCTGCTACCT GTGGTGAGCT TTGCTCTGTA CCCTGAGGAG 60
ATACTGGACA CCCACTGGGA GCTATGGAAG AAGACCCACA GGAAGCAATA TAACAACAAG 120
GTGGATGAAA TCTCTCGGCG TTTAATTTGG GAAAAAAACC TGAAGTATAT TTCCATCCAT 180
AACCTTGAGG CTTCTCTTGG TGTCCATACA TATGAACTGG CTATGAACCA CCTGGGGGAC 240
ATGACCAGTG AAGAGGTGGT TCAGAAGATG ACTGGACTCA AAGTACCCCT GTCTCATTCC 300
CGCAGTAATG ACACCCTTTA TATCCCAGAA TGGGAAGGTA GAGCCCCAGA CTCTGTCGAC 360
TATCGAAAGA AAGGATATGT TACTCCTGTC AAAAATCAGG GTCAGTGTGG TTCCGCTTGG 420
GCTTTTAGCT CTGTGGGTGC CCTGGAGGGC CAACTCAAGA AGAAAACTGG CAAACTCTTA 480
AATCTGAGTC CCCAGAACCT AGTGGATTGT GTGTCTGAGA ATGATGGCTG TGGAGGGGGC 540
TACATGACCA ATGCCTTCCA ATATGTGCAG AAGAACCGGG GTATTGACTC TGAAGATGCC 600
TACCCATATG TGGGACAGGA AGAGAGTTGT ATGTACAACC CAACAGGCAA GGCAGCTAAA 660
TGCAGAGGGT ACAGAGAGAT CCCCGAGGGG AATGAGAAAG CCCTGAAGAG GGCAGTGGCC 720
CGAGTGGGAC CTGTCTCTGT GGCCATTGAT GCAAGCCTGA CCTCCTTCCA GTTTTACAGC 780
AAAGGTGTGT ATTATGATGA AAGCTGCAAT AGCGATAATC TGAACCATGC GGTTTTGGCA 840
GTGGGATATG GAATCCAGAA GGGAAACAAG CACTGGATAA TTAAAAACAG CTGGGGAGAA 900
AACTGGGGAA ACAAAGGATA TATCCTCATG GCTCGAAATA AGAACAACGC CTGTGGCATT 960
GCCAACCTGG CCAGCTTCCC CAAGATGTGA 990
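SEQ ID NO: 12 and SEQ ID NO: 13 are printed with the same length and differ only in the row covering bases 361-420. The following minimal Python sketch, which is not part of the original disclosure, locates the differing base and the codon it falls in. The two strings are hand transcriptions of that row with OCR spacing removed, the reading frame is assumed to start at base 1 (the ATG that opens both listings), and the names `codon_differences`, `SEQ12_361_420` and `SEQ13_361_420` are illustrative.

```python
# Bases 361-420 of each listing (hand-transcribed, OCR spacing removed).
SEQ12_361_420 = "TATCGAAAGAAAGGATATGTTACTCCTGTCAAAAATCAGGGTCAGTGTGGTTCCTCTTGG"
SEQ13_361_420 = "TATCGAAAGAAAGGATATGTTACTCCTGTCAAAAATCAGGGTCAGTGTGGTTCCGCTTGG"
ROW_START = 361  # 1-based position of the first base of the row

# Only the codons relevant to this row are listed (standard genetic code).
CODON_NAMES = {"TCT": "Ser", "GCT": "Ala", "TGT": "Cys"}


def codon_differences(a: str, b: str, row_start: int):
    """Return (base position, codon number, codon in a, codon in b)
    for every position at which the two equal-length rows differ."""
    if len(a) != len(b):
        raise ValueError("rows must have the same length")
    if (row_start - 1) % 3 != 0:
        raise ValueError("row must start on a codon boundary")
    found = []
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            start = i - i % 3                       # first base of the codon
            position = row_start + i                # 1-based base position
            codon_number = (row_start - 1 + start) // 3 + 1
            found.append((position, codon_number, a[start:start + 3], b[start:start + 3]))
    return found


differences = codon_differences(SEQ12_361_420, SEQ13_361_420, ROW_START)
for position, codon_number, codon_a, codon_b in differences:
    print(position, codon_number,
          codon_a, CODON_NAMES.get(codon_a, "?"),
          codon_b, CODON_NAMES.get(codon_b, "?"))
```

Under these assumptions the single difference is at base 415, in codon 139, which reads TCT (Ser) in SEQ ID NO: 12 and GCT (Ala) in SEQ ID NO: 13. This agrees with residue 139 as printed in SEQ ID NO: 14 (Ser) and SEQ ID NO: 15 (Ala), and with the replacement of cysteine by serine or alanine described in the abstract.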



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 329 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Met Trp Gly Leu Lys Val Leu Leu Leu Pro Val Val Ser Phe Ala Leu
1               5                   10                  15

Tyr Pro Glu Glu Ile Leu Asp Thr His Trp Glu Leu Trp Lys Lys Thr
            20                  25                  30

His Arg Lys Gln Tyr Asn Asn Lys Val Asp Glu Ile Ser Arg Arg Leu
        35                  40                  45

Ile Trp Glu Lys Asn Leu Lys Tyr Ile Ser Ile His Asn Leu Glu Ala
    50                  55                  60

Ser Leu Gly Val His Thr Tyr Glu Leu Ala Met Asn His Leu Gly Asp
65                  70                  75                  80

Met Thr Ser Glu Glu Val Val Gln Lys Met Thr Gly Leu Lys Val Pro
                85                  90                  95

Leu Ser His Ser Arg Ser Asn Asp Thr Leu Tyr Ile Pro Glu Trp Glu
            100                 105                 110

Gly Arg Ala Pro Asp Ser Val Asp Tyr Arg Lys Lys Gly Tyr Val Thr
        115                 120                 125

Pro Val Lys Asn Gln Gly Gln Cys Gly Ser Ser Trp Ala Phe Ser Ser
    130                 135                 140

Val Gly Ala Leu Glu Gly Gln Leu Lys Lys Lys Thr Gly Lys Leu Leu
145                 150                 155                 160

Asn Leu Ser Pro Gln Asn Leu Val Asp Cys Val Ser Glu Asn Asp Gly
                165                 170                 175

Cys Gly Gly Gly Tyr Met Thr Asn Ala Phe Gln Tyr Val Gln Lys Asn
            180                 185                 190

Arg Gly Ile Asp Ser Glu Asp Ala Tyr Pro Tyr Val Gly Gln Glu Glu
        195                 200                 205

Ser Cys Met Tyr Asn Pro Thr Gly Lys Ala Ala Lys Cys Arg Gly Tyr
    210                 215                 220

Arg Glu Ile Pro Glu Gly Asn Glu Lys Ala Leu Lys Arg Ala Val Ala
225                 230                 235                 240

Arg Val Gly Pro Val Ser Val Ala Ile Asp Ala Ser Leu Thr Ser Phe
                245                 250                 255

Gln Phe Tyr Ser Lys Gly Val Tyr Tyr Asp Glu Ser Cys Asn Ser Asp
            260                 265                 270

Asn Leu Asn His Ala Val Leu Ala Val Gly Tyr Gly Ile Gln Lys Gly
        275                 280                 285

Asn Lys His Trp Ile Ile Lys Asn Ser Trp Gly Glu Asn Trp Gly Asn
    290                 295                 300

Lys Gly Tyr Ile Leu Met Ala Arg Asn Lys Asn Asn Ala Cys Gly Ile
305                 310                 315                 320

Ala Asn Leu Ala Ser Phe Pro Lys Met
                325



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 329 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

Met Trp Gly Leu Lys Val Leu Leu Leu Pro Val Val Ser Phe Ala Leu
1               5                   10                  15

Tyr Pro Glu Glu Ile Leu Asp Thr His Trp Glu Leu Trp Lys Lys Thr
            20                  25                  30

His Arg Lys Gln Tyr Asn Asn Lys Val Asp Glu Ile Ser Arg Arg Leu
        35                  40                  45

Ile Trp Glu Lys Asn Leu Lys Tyr Ile Ser Ile His Asn Leu Glu Ala
    50                  55                  60

Ser Leu Gly Val His Thr Tyr Glu Leu Ala Met Asn His Leu Gly Asp
65                  70                  75                  80

Met Thr Ser Glu Glu Val Val Gln Lys Met Thr Gly Leu Lys Val Pro
                85                  90                  95

Leu Ser His Ser Arg Ser Asn Asp Thr Leu Tyr Ile Pro Glu Trp Glu
            100                 105                 110

Gly Arg Ala Pro Asp Ser Val Asp Tyr Arg Lys Lys Gly Tyr Val Thr
        115                 120                 125

Pro Val Lys Asn Gln Gly Gln Cys Gly Ser Ala Trp Ala Phe Ser Ser
    130                 135                 140

Val Gly Ala Leu Glu Gly Gln Leu Lys Lys Lys Thr Gly Lys Leu Leu
145                 150                 155                 160

Asn Leu Ser Pro Gln Asn Leu Val Asp Cys Val Ser Glu Asn Asp Gly
                165                 170                 175

Cys Gly Gly Gly Tyr Met Thr Asn Ala Phe Gln Tyr Val Gln Lys Asn
            180                 185                 190

Arg Gly Ile Asp Ser Glu Asp Ala Tyr Pro Tyr Val Gly Gln Glu Glu
        195                 200                 205

Ser Cys Met Tyr Asn Pro Thr Gly Lys Ala Ala Lys Cys Arg Gly Tyr
    210                 215                 220

Arg Glu Ile Pro Glu Gly Asn Glu Lys Ala Leu Lys Arg Ala Val Ala
225                 230                 235                 240

Arg Val Gly Pro Val Ser Val Ala Ile Asp Ala Ser Leu Thr Ser Phe
                245                 250                 255

Gln Phe Tyr Ser Lys Gly Val Tyr Tyr Asp Glu Ser Cys Asn Ser Asp
            260                 265                 270

Asn Leu Asn His Ala Val Leu Ala Val Gly Tyr Gly Ile Gln Lys Gly
        275                 280                 285

Asn Lys His Trp Ile Ile Lys Asn Ser Trp Gly Glu Asn Trp Gly Asn
    290                 295                 300

Lys Gly Tyr Ile Leu Met Ala Arg Asn Lys Asn Asn Ala Cys Gly Ile
305                 310                 315                 320

Ala Asn Leu Ala Ser Phe Pro Lys Met
                325
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WHAT IS CLAIMED: 

1. A peptide comprising a ligand having binding affinity for
a tyrosine phosphatase or cysteine protease, wherein said ligand contains two
or more 4-phosphono(difluoromethyl) phenylalanine groups.

2. The peptide of Claim 1 wherein said ligand has a greater
binding affinity than the corresponding ligand only containing one of said
4-phosphono(difluoromethyl) phenylalanine groups.

3. A peptide selected from the group consisting of:
N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide (BzN-EJJ-CONH2), where E is glutamic acid and J is [4-phosphono(difluoromethyl)]-L-phenylalanyl;
N-Benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;
N-Acetyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;
L-Glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;
L-Lysinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;
L-Serinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide;
L-Prolinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide; and
L-Isoleucinyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanine amide.

4. The peptide of Claim 3 in tritiated or I-125 iodinated form.

5. A tritiated peptide, N-(3,5-Ditritio)benzoyl-L-glutamyl-[4-phosphono(difluoromethyl)]-L-phenylalanyl-[4-phosphono(difluoromethyl)]-L-phenylalanineamide.
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6. A process for increasing the binding affinity of a ligand 
for a tyrosine phosphatase or cysteine protease comprising introducing into 
the ligand two or more 4-phosphono(difluoromethyl) phenylalanine groups. 

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ATGGAGATGGAAAAGGAGTTCGAGCAGATCGACAAGTCCGGGAGCTGGGCGGCCATTTAC 
1 + + + + + + 60
TACCTCTACCTTTTCCTCAAGCTCGTCTAGCTGTTCAGGCCCTCGACCCGCCGGTAAATG
1 MetGluMetGluLysGluPheGluGlnIleAspLysSerGlySerTrpAlaAlaIleTyr 20

CAGGATATCCGACATGAAGCCAGTGACTTCCCATGTAGAGTGGCCAAGCTTCCTAAGAAC
61 + + + + + + 120
GTCCTATAGGCTGTACTTCGGTCACTGAAGGGTACATCTCACCGGTTCGAAGGATTCTTG
21 GlnAspIleArgHisGluAlaSerAspPheProCysArgValAlaLysLeuProLysAsn 40

AAAAACCGAAATAGGTACAGAGACGTCAGTCCCTTTGACCATAGTCGGATTAAACTACAT
121 + + + + + + 180
TTTTTGGCTTTATCCATGTCTCTGCAGTCAGGGAAACTGGTATCAGCCTAATTTGATGTA
41 LysAsnArgAsnArgTyrArgAspValSerProPheAspHisSerArgIleLysLeuHis 60

CAAGAAGATAATGACTATATCAACGCTAGTTTGATAAAAATGGAAGAAGCCCAAAGGAGT
181 + + + + + + 240
GTTCTTCTATTACTGATATAGTTGCGATCAAACTATTTTTACCTTCTTCGGGTTTCCTCA
61 GlnGluAspAsnAspTyrIleAsnAlaSerLeuIleLysMetGluGluAlaGlnArgSer 80

TACATTCTTACCCAGGGCCCTTTGCCTAACACATGCGGTCACTTTTGGGAGATGGTGTGG
241 + + + + + + 300
ATGTAAGAATGGGTCCCGGGAAACGGATTGTGTACGCCAGTGAAAACCCTCTACCACACC
81 TyrIleLeuThrGlnGlyProLeuProAsnThrCysGlyHisPheTrpGluMetValTrp 100

GAGCAGAAAAGCAGGGGTGTCGTCATGCTCAACAGAGTGATGGAGAAAGGTTCGTTAAAA
301 + + + + + + 360
CTCGTCTTTTCGTCCCCACAGCAGTACGAGTTGTCTCACTACCTCTTTCCAAGCAATTTT
101 GluGlnLysSerArgGlyValValMetLeuAsnArgValMetGluLysGlySerLeuLys 120

TGCGCACAATACTGGCCACAAAAAGAAGAAAAAGAGATGATCTTTGAAGACACAAATTTG
361 + + + + + + 420
ACGCGTGTTATGACCGGTGTTTTTCTTCTTTTTCTCTACTAGAAACTTCTGTGTTTAAAC
121 CysAlaGlnTyrTrpProGlnLysGluGluLysGluMetIlePheGluAspThrAsnLeu 140

AAATTAACATTGATCTCTGAAGATATCAAGTCATATTATACAGTGCGACAGCTAGAATTG
421 + + + + + + 480
TTTAATTGTAACTAGAGACTTCTATAGTTCAGTATAATATGTCACGCTGTCGATCTTAAC
141 LysLeuThrLeuIleSerGluAspIleLysSerTyrTyrThrValArgGlnLeuGluLeu 160

GAAAACCTTACAACCCAAGAAACTCGAGAGATCTTACATTTCCACTATACCACATGGCCT
481 + + + + + + 540
CTTTTGGAATGTTGGGTTCTTTGAGCTCTCTAGAATGTAAAGGTGATATGGTGTACCGGA
161 GluAsnLeuThrThrGlnGluThrArgGluIleLeuHisPheHisTyrThrThrTrpPro 180

GACTTTGGAGTCCCTGAATCACCAGCCTCATTCTTGAACTTTCTTTTCAAAGTCCGAGAG
541 + + + + + + 600
CTGAAACCTCAGGGACTTAGTGGTCGGAGTAAGAACTTGAAAGAAAAGTTTCAGGCTCTC
181 AspPheGlyValProGluSerProAlaSerPheLeuAsnPheLeuPheLysValArgGlu 200

TCAGGGTCACTCAGCCCGGAGCACGGGCCCGTTGTGGTGCACTGCAGTGCAGGCATCGGC
601 + + + + + + 660
AGTCCCAGTGAGTCGGGCCTCGTGCCCGGGCAACACCACGTGACGTCACGTCCGTAGCCG
201 SerGlySerLeuSerProGluHisGlyProValValValHisCysSerAlaGlyIleGly 220

AGGTCTGGAACCTTCTGTCTGGCTGATACCTGCCTCCTGCTGATGGACAAGAGGAAAGAC
661 + + + + + + 720
TCCAGACCTTGGAAGACAGACCGACTATGGACGGAGGACGACTACCTGTTCTCCTTTCTG
221 ArgSerGlyThrPheCysLeuAlaAspThrCysLeuLeuLeuMetAspLysArgLysAsp 240

CCTTCTTCCGTTGATATCAAGAAAGTGCTGTTAGAAATGAGGAAGTTTCGGATGGGGTTG
721 + + + + + + 780
GGAAGAAGGCAACTATAGTTCTTTCACGACAATCTTTACTCCTTCAAAGCCTACCCCAAC
241 ProSerSerValAspIleLysLysValLeuLeuGluMetArgLysPheArgMetGlyLeu 260

ATCCAGACAGCCGACCAGCTGCGCTTCTCCTACCTGGCTGTGATCGAAGGTGCCAAATTC
781 + + + + + + 840
TAGGTCTGTCGGCTGGTCGACGCGAAGAGGATGGACCGACACTAGCTTCCACGGTTTAAG
261 IleGlnThrAlaAspGlnLeuArgPheSerTyrLeuAlaValIleGluGlyAlaLysPhe 280

ATCATGGGGGACTCTTCCGTGCAGGATCAGTGGAAGGAGCTTTCCCACGAGGACCTGGAG
841 + + + + + + 900
TAGTACCCCCTGAGAAGGCACGTCCTAGTCACCTTCCTCGAAAGGGTGCTCCTGGACCTC
281 IleMetGlyAspSerSerValGlnAspGlnTrpLysGluLeuSerHisGluAspLeuGlu 300

CCCCCACCCGAGCATATCCCCCCACCTCCCCGGCCACCCAAACGAATCCTGGAGCCACACTGA
901 + + + + + + 960
GGGGGTGGGCTCGTATAGGGGGGTGGAGGGGCCGGTGGGTTTGCTTAGGACCTCGGTGTGACT
301 ProProProGluHisIleProProProProArgProProLysArgIleLeuGluProHisEnd 320
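In the figure above, each block shows the coding strand on the first line and, below the ruler, the complementary strand written base for base in the same left-to-right order. A transcription of any block can therefore be checked by complementing the upper line. The following minimal Python sketch is not part of the original disclosure; the names `complement`, `TOP_1_60` and `BOTTOM_1_60` are illustrative, and the two strings are bases 1-60 of the first block as printed.

```python
# Watson-Crick pairing: A<->T and C<->G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand: str) -> str:
    """Base-by-base complement, kept in the same left-to-right order
    as the input (not reversed), matching the layout of the figure."""
    return strand.upper().translate(COMPLEMENT)


# Bases 1-60 of the first block: upper (coding) and lower strand.
TOP_1_60 = "ATGGAGATGGAAAAGGAGTTCGAGCAGATCGACAAGTCCGGGAGCTGGGCGGCCATTTAC"
BOTTOM_1_60 = "TACCTCTACCTTTTCCTCAAGCTCGTCTAGCTGTTCAGGCCCTCGACCCGCCGGTAAATG"

matches = complement(TOP_1_60) == BOTTOM_1_60
print(matches)
```

The same check, applied block by block, flags positions where either strand was misread during transcription.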
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6AAACAAGCACTGGATTCCATATCCCACTGCCAAAACCGCATGGTTCAGATTATCGCTAT 

i + + + + + + 60 

CTTTGTTCGTGACCTAAGGTATAGGGTGACGGTTTTGGCGTACCAAGTCTAATAGCGATA 

TGCAGCTTTCATCATAATACACACCTTTGCTGCCGAAACGAAGCCAGACAACAGATTT JZ 

61 + + + + + + 120 

ACGTCGAAAGTAGTATTATGTGTGGAAACGACGGCTTTGCTTCGGTCTGTTGTCTAAAGG 

. ATCAGCAGGATGTGGGGGCTCAAGGTTCTGCTGCTACCTGTGGTGAGCTTTGCTCTGTAC 

12 i + + + + + + 

TAGTCGTCCTACACCCCCGAGTTCCAAGACGACGATGGACACCACTCGAAACGAGACATG 

MetTrpGl yLeuLysVal LeuLeuLeuProVal Val SerPheAl aLeulyr 

CCTGAGGAGATACTGGACACCCACTGGGAGCTATGGAAGAAGACCCACAGGAAGCAATAT 

181 + + + + + + 240 

GGACTCCTCTATGACCTGTGGGTGACCCTCGATACCTTCTTCTGGGTGTCCTTCGTTATA 

ProGl uGl ul 1 eLeuAspThrHi sTrpGl uLeuTrpLysLysThrHi sArgLysGl nTyr 

AACAACAAGGTGGATGAAATCTCTCGGCGTTTAATTTGGGAAAAAAACCTGAAGTATATT 

241 + + + + + --" + 300 

TTGTTGTTCCACCTACTTTAGAGAGCCGCAAATTAAACCCTTTTTTTGGACTTCATATAA 

AsnAsnLysValAspGluIleSerArgArgLeuIleTrpGluLysAsnLeuLysTyrlle 

TCCATCCATAACCTTGAGGCTTCTCTTGGTGTCCATACATATGAACTGGCTATGAACCAC 

30i - + + + + + + 3bU 

AGGTAGGTATTGGAACTCCGAAGAGAACCACAGGTATGTATACTTGACCGATACTTGGTG 

Ser 1 1 eHi sAsnLeuG'l uAl aSerLeuGl yVa 1 Hi sThrTy rGl uLeuAl aMetAsnHi s 

CTGGGGGACATGACCAGTGAAGAGGTGGTTCAGAAGATGACTGGACTCAAAGTACCCCTG 

361 + + + + + + 420 

GACCCCCTGTACTGGTCACTTCTCCACCAAGTCTTCTACTGACCTGAGTTTCATGGGGAC 

LeuGl yAspMetThrSerGI uGl uVal Val Gl nLysMet i hrGlyLeuLysVal ProLeu 

TCTCATTCCCGCAGTAATGACACCCTTTATATCCCAGAATGGGAAGGTAGAGCCCCAGAC 

421 + + + + + + 480 

AGAGTAAGGGCGTCATTACTGTGGGAAATATAGGGTCTTACCCTTCCATCTCGGGGTCTG 

SerHisSerArgSerAsnAspThrLeuTyrlleProGluTrpGluGlyArgAlaProAsp 

TCTGTCGACTATCGAAAGAAAGGATATGTTACTCCTGTCAAAAATCAGGGTCAGTGTGGT 

481 + + + + + + 540 

AGACAGCTGATAGCTTTCTTTCCTATACAATGAGGACAGTTTTTAGTCCCAGTCACACCA 

SerValAspTyrArgLysLysGlyTyrValThrProValLysAsnGlnGlyGlnCysGly 
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TCCTGTTGGGCTTTTAGCTCTGTGGGTGCCCTGGAGGGCCAACTCAAGAAGAAAACTGGC ^ 

541 AGGACAACCCGAAAATCGAGACAC^ 

SerCysTrpAl aPheSerSerValGlyAl aLeuGl uGlyGl nLeuLysLysLysThrGl y 

139 

AMCTCTTAMTCTGAGTCCCCAGMCCTAGTGGATTGTGTGTCTGAGAATGATGGCTG ^ 

601 TTTGAGAATTTAGACTCAGGGGTCTTGGATCACCTMCACACAG 

LysLeuL^uAsnLeuSerProGlnAsnLeuValAspCysValSerGluAsnAspGlyCys 

GGAGGGGGCTACATGACCMTGCCTTCCMTATGTGCAGAAGMCCGGGGTATTGACTCT _^ 
661 CCTCCCCCGATGTACTGGTTACGGAAGGTTATACACGTCTTCTTGGCCCCATAACT 

GAAGATGCCTACCCATATGTGGGACAGGAAGAGAGTTGTATGTACAACCCAACAGGCAAG ; ^ 
721 CTTCTACGGATGGGTATACACCCTGTCCTTCTCTCAACATACATGTTGGGTTCT 

GCAGCTAMTGCAGAGGGTACAGAGAGATCCCCGAGGGGMTGAGAAAGCCCTGAAGAGG ^ 

781 CGTCGATTTACGTCTCCCATGTCTCTCTAGGGGCTCCCCT 
AlaAlaLysCysA^ 

GCAGTGGCCCGAGTGGGACCTGTCTCTGTGGCCATTGATGCAAGCCTGACCTCCTTCCAG ^ 

841 CGTCACCGGGCTCACCCTGGACAGAGACACCGGTAACTACGTTCGGACra 

AlaValAlaArgValGlyProValSerValAlalleAspAlaSerLeuThrSerPheGln 

TTTTACAGCAAAGGTGTGTATTATGATGAAAGCTGCAATAGCGATAATCTGAACCATGCG ^ Q 

901 aaaatgtcgtttccacacaWactactttcgacgttatcgctattagacttgg^ 

PheTy rSerLysGl yVal TyrTyrAspGl uSerCysAsnSerAspAsnLeuAsnHi sAl a 

gttttggcagtgggatatggaatccagaagggaaacaagcactggataattaaaaacagc iq2o 
961 caaaaccgtcaccctataccttaggtctocctttc 

ValLeuAlaValGlyTyrGlylleGlnLysGlyAsnLysHisTrpIlelleLysAsnSer 

tggggagaaaactggggamcaaaggatatatcctcatggctcgaaataagaacaacgcc i 

1021 ACCCCTCTTTTGACCCCTTTGTTT 

TrpGlyGl uAsnTrpGl yAsnLy sGl yTy r 1 1 eLeuMetAl aArgAsnLy sAsnAsnAl a 
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TGTGGCATTGCCMCCTGGCCAGCTTCCCCMGATGTGACTCCAGCCAGCCAAATCCATC ^ 

1081 ACACCGTAACGGTTGGACCGGTCGAAGGGGTTCTACACTGAGGTCGGTCGGTTTAGGTAG 
CysGly II eAl aAsnLeuAl aSerPheProLysMetEnd 

CTGCTCTTCCATTTCTTCCACGATGGTGCAGTGTAACGATGCACTTTGGAAGGGAGTTGG 

+ _ + -f + + 1 <lUU 

1 GACGAGAAGGTAAAGAAGGTGCTACCACGTCACATTGCTACGTGAAACCTTCCCTCAACC 

TGTGCTATTTTTGAAGCAGATGTGGTGATACTGAGATTGTCTGTTCAGTTTCCCCATTTG ^ 

1201 ACACGATAMAACTTCGTCTACA^ 

TTTGTGCTTCAMTGATCCTTCCTACTTTGCTTCTCTCCACCCATGACCTTTTTCACTGT ^ 

1261 AAACACGAAGTTTACTAGGMGGATGAAACGAAGAGAGGTGGGTACTGGAAAAAGTGACA 

GGCCATCAGGACTTTCCCTGACAGCTGTGTACTCTTAGGCTAAGAGATGTGACTACAGCC 

n?1 ___ + + + + + + iJbU 

CCGGTAGTCCTGAAAGGGACTGTCGACACATGAGAATCCGATTCTCTACACTGATGTCGG 



TGCCCCT 



GACTGTGTTGTCCCAGGGCTGATGCTGTACAGGTACAGGCTGGAGATTTTCAC 



+ + 1440 



1 OO] + + + 

ACGGGGACTGACACAACAGGGTCCCGACTACGACATGTCCATGTCCGACCTCTAAAAGTG 

ATAGGTTAGATTCTCATTCACGGGACTAGTTAGCTTTAAGCACCCTAGAGGACTAGGGTA 

-.441 + + + + + + ibUU 

TATCCAATCTAAGAGTAAGTGCCCTGATCAATCGAAATTCGTGGGATCTCCTGATCCCAT 

ATCTGACTTCTCACTTCCTAAGTTCCCTTCTATATCCTCAAGGTAGAAATGTCTATGTTT 

icni + + + + + + ibbU 

TAGACTGAAGAGTGAAGGATTCAAGGGAAGATATAGGAGTTCCATCTTTACAGATACAAA 

TCTACTCCAATTCATAAATCTATTCATAAGTCTTTGGTACAAGTTTACATGATAAAAAGA 

1561 + + + + + + ibZU 

AGATGAGGTTAAGTATTTAGATAAGTATTCAGAAACCATGTTCAAATGTACTATTTTTCT 

AATGTGATTTGTCTTCCCTTCTTTGCACTTTTGAAATAAAGTATTTATC 

162 i + + + + 1669 

TTACACTAAACAGAAGGGAAGAAACGTGAAAACTTTATTTCATAAATAG 
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CTGCAGGAATTCGGCACGAGGGGTGCTATTGTGAGGCGGTTGTAGAAGTTAATAAAGGTA 

+ + + + + + 

GACGTCCTTAAGCCGTGCTCCCCACGATAACACTCCGCCAACATCTTCAATTATTTCCAT 



TCCATGGAGAACACTGAAAACTCAGTGGATTCAAAATCCATTAAAAATTTGGAACCAAAG 

61 + + + + + + 120 

AGGTACCTCTTGTGACTTTTGAGTCACCTAAGTTTTAGGTAATTTTTAAACCTTGGTTTC 

MetGluAsnThrGluAsnSerValAspSerLysSerlleLysAsnLeuGluProLys 

ATCATACATGGAAGCGAATCAATGGACTCTGGAATATCCCTGGACAACAGTTATAAAATG 

121 - + + + + + + 180 

TAGTATGTACCTTCGCTTAGTTACCTGAGACCTTATAGGGACCTGTTGTCAATATTTTAC 
IlelleHi sGl ySerGl uSerMetAspSerGI y I 1 eSerLeuAspAsnSerTy rLysMet 



+ + 240 



GATTATCCTGAGATGGGTTTATGTATAATAATTAATAATAAGAATTTTCATAAGAGCACT 

]^Q\ + ■+- + + 

CTAATAGGACTCTACCCAAATACATATTATTAATTATTATTCTTAAAAGTATTCTCGTGA. 
AspTyrProGluMetGlyLeuCysIlellelleAsnAsnLysAsnPheHisLysSerThr 

GGAATGACATCTCGGTCTGGTACAGATGTCGATGCAGCAAACCTCAGGGAAACATTCAGA' 

+ + + + + + 300 

CCTTACTGTAGAGCCAGACCATGTCTACAGCTACGTCGTTTGGAGTCCCTTTGTAAGTCT 
GlyMetThrSerArgSerGlyThrAspValAspAlaAl aAsnLeuArgGl uThrPheArg 

AACTTGAAATATGAAGTCAGGAATAAAAATGATCTTACACGTGAAGAAATTGTGGAATTG. 

30i + +-- -+ + + + 3 °0 

TTGAACTTTATACTTCAGTCCTTATTTTTACTAGAATGTGCACTTCTTTAACACCTTAAC 

AsnLeuLysTyrGl uVal ArgAsnLysAsnAspLeuThrArgGl uGluIl eVal Gl uLeu 

ATGCGTGATGTTTCTAAAGAAGATCACAGCAAAAGGAGCAGTTTTGTTTGTGTGCTTCTG 

361 + + + + + + 420 

TACGCACTACAAAGATTTCTTCTAGTGTCGTTTTCCTCGTGAAAACAAACACACGAAGAC 
MetArgAspVal SerLysGI uAspHi sSerLysArgSerSerPheVal CysVal LeuLeu 

AGCCATGGTGAAGAAGGAATAATTTTTGGAACAAATGGACCTGTTGACCTGAAAAAAATA 

421 -.- + + + + + + 480 

TCGGTACCACTTCTTCCTTATTAAAAACCTTGTTTACCTGGACAACTGGACTTTTTTTAT 

SerHi sGl yGl uGl uGlyllell ePheGlyThrAsnGI yProVal AspLeuLysLys 1 1 e 

ACAAACTTTTTCAGAGGGGATCGTTGTAGAAGTCTAACTGGAAAACCCAAACTTTTCATT 

481 . + + + + + + 540 

TGTTTGAAAAAGTCTCCCCTAGCAACATCTTCAGATTGACCTTTTGGGTTTGAAAAGTAA 

ThrAsnPhePheArgGlyAspArgCysArgSerLeuThrGlyLysProLysLeuPhelle 
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ATTCAGGCCTGCCGTGGTACAGAACTGGACTGTGGCATTGAGACAGACAGTGGTGTTGAT 

541 + + + + + + 600 
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